Statistical phenomena in data selection and data enrichment
| title | Statistical phenomena in data selection and data enrichment |
|---|
| start_date | 2024/06/27 |
|---|
| schedule | 11h30-12h30 |
|---|
| online | no |
|---|
| location_info | Amphi Jaurès |
|---|
| summary | Building and training powerful machine learning models has become increasingly feasible thanks to new architectures, software infrastructure, and the prevalence of foundation models. Nowadays, developing a high-quality dataset for a specific use case of interest is often the key bottleneck to successful machine learning applications. I will discuss two approaches to alleviating this problem: selecting highly informative samples from a large dataset, and merging a small dataset with surrogate data from a different source. I will give an overview of some of the ideas in the literature on this problem and present some findings arising from the analysis of simple statistical models. [Based on joint work with Germain Kolossov, Ayush Jain, Eren Sasoglu, Pulkit Tandon] |
|---|
| responsibles | Loureiro, Lorenzi, Peyré, Biroli, Mallat |
|---|