Data quality for low-resource MT

| old_uid | 19293 |
|---|
| title | Data quality for low-resource MT |
|---|
| start_date | 2021/07/02 |
|---|
| schedule | 16h |
|---|
| online | no |
|---|
| summary | In this talk, I will present the findings of a collaborative audit of multilingual corpora, with special attention to low-resource languages. We will discuss the challenges of building such corpora and the risks of using them without inspection. With a case study on a subset of African languages, I will illustrate the implications of building machine translation systems on low-quality parallel data. |
|---|
| responsibles | Seddah |
|---|