|
Provenance and Computational Reproducibility
| old_uid | 17385 |
|---|
| title | Provenance and Computational Reproducibility |
|---|
| start_date | 2019/02/11 |
|---|
| schedule | 10h30-13h |
|---|
| online | no |
|---|
| location_info | Room Gilles Kahn |
|---|
| summary | Data-driven exploration has revolutionized science, industry, and government alike. The abundance of data, coupled with cheap and widely available computing and storage resources, has created a perfect storm that enabled this revolution. Now, the main bottleneck lies with people. Extracting actionable insight from data requires complex computational processes that are not only hard to assemble but can also behave (and break) in unforeseen ways. Thus, when results are derived, an important question is whether you can trust them.
In this talk, I will discuss the importance of maintaining detailed provenance (also referred to as lineage and pedigree) for data and computations. I will give an overview of techniques for capturing, managing, and reusing provenance information, and describe emerging applications and novel uses of provenance in collaborative data analysis, teaching science, and publishing reproducible results. Through concrete examples, I will also show that, besides providing important documentation that is key to preserving data, determining its quality, and reproducing and validating results, provenance can also be used to streamline the data exploration process. |
|---|
| responsibles | Zweigenbaum |
|---|
| |
|