| old_uid | 11681 |
|---|
| title | Analytics, Cloud-Computing, and Crowdsourcing - or How To Destroy My Job... |
|---|
| start_date | 2012/10/04 |
|---|
| schedule | 17h |
|---|
| online | no |
|---|
| location_info | 2600:105 |
|---|
| summary | We are witnessing the resurgence of analytics as a key differentiator for creating new services, the emergence of cloud computing as a disrupting technology for service delivery, and the growth of crowdsourcing as a new phenomenon in which people play critical roles in creating information and shaping decisions in a variety of problems. After introducing the first two (well-known) concepts, we will analyze some of the opportunities created by the advent of crowdsourcing.
Then, we will explore the intersections of these three concepts. We will examine their evolution through the lens of a professional machine-learning researcher and try to understand how his job and roles have evolved over time.
In the past, analytic model building was an artisanal process, as models were handcrafted by an experienced, knowledgeable model builder. More recently, the use of meta-heuristics (such as evolutionary algorithms) has provided limited levels of automation in model building and maintenance. In the not-so-distant future, we expect analytic models to become a commodity. We envision having access to a large number of data-driven models, obtained by a combination of crowdsourcing, crowdservicing, cloud-based evolutionary algorithms, outsourcing, in-house development, and legacy models.
In this new context, the critical issue will be model ensemble selection and fusion, rather than model generation. We address this issue by proposing customized model ensembles on demand, inspired by Lazy Learning. In our approach, referred to as Lazy Meta-Learning, for a given query we find the most relevant models in a database of models, using their meta-information. After retrieving the relevant models, we select a subset with highly uncorrelated errors (unless diversity was already injected in their design process). With these models we create an ensemble, using their meta-information for dynamic bias compensation and relevance weighting. The output is a weighted interpolation or extrapolation of the outputs of the model ensemble. The confidence interval around the output shrinks as we increase the number of uncorrelated models in the ensemble. This approach is agnostic with respect to the genesis of the models, making it scalable and suitable for a variety of applications. We have successfully tested it on a regression problem for a power plant management application, using two different sources of models: bootstrapped neural networks and GP-created symbolic regressors evolved on a cloud. |
|---|
| responsibles | Revault d'Allonnes |
|---|
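The Lazy Meta-Learning pipeline described in the summary (retrieve relevant models per query, keep a subset with weakly correlated errors, then fuse with bias compensation and relevance weighting) can be sketched as follows. This is a minimal illustration only: the `ModelRecord` fields, the distance-based relevance measure, and the correlation threshold are all assumptions made for the sketch, not the speaker's actual implementation.

```python
import numpy as np

class ModelRecord:
    """A predictive model plus the meta-information used to select and fuse it.
    All fields here are illustrative assumptions, not a published schema."""
    def __init__(self, predict, center, bias, error_history):
        self.predict = predict            # callable: query -> prediction
        self.center = np.asarray(center)  # region of competence (meta-info)
        self.bias = bias                  # known systematic bias (meta-info)
        self.error_history = np.asarray(error_history)  # past residuals

def lazy_meta_learn(query, model_db, k=5, corr_threshold=0.9):
    query = np.asarray(query)
    # 1. Lazily retrieve, for this query only, the k most relevant models,
    #    here using distance to each model's region of competence.
    dists = [np.linalg.norm(m.center - query) for m in model_db]
    ranked = [m for _, m in sorted(zip(dists, model_db),
                                   key=lambda t: t[0])][:k]
    # 2. Keep a subset whose past errors are weakly correlated with
    #    each other (a proxy for ensemble diversity).
    chosen = [ranked[0]]
    for m in ranked[1:]:
        if all(abs(np.corrcoef(m.error_history, c.error_history)[0, 1])
               < corr_threshold for c in chosen):
            chosen.append(m)
    # 3. Fuse the ensemble: subtract each model's known bias and weight
    #    its output by its relevance to the query.
    weights = np.array([1.0 / (1e-9 + np.linalg.norm(m.center - query))
                        for m in chosen])
    weights /= weights.sum()
    preds = np.array([m.predict(query) - m.bias for m in chosen])
    return float(weights @ preds)
```

As the abstract notes, the fusion step is agnostic to where each model came from, so `model_db` could mix crowdsourced, evolved, and legacy models behind the same interface.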