Language technology for everyone

old_uid15667
titleLanguage technology for everyone
start_date2015/05/22
schedule15h
onlineno
location_infosalle 118
summaryHigh-quality NLP tools, from tokenizers to semantic parsers, exist for only 10-15 of the world’s seven thousand languages, even though digital texts are available for at least a quarter of them. Even for major languages such as English, our tools fare reasonably well only on standard language, not on informal language or dialect. Gender and age biases also affect our tools’ performance. In addition, our tools often overfit to arbitrary annotation choices, arguably making them even less robust to linguistic diversity. This talk surveys recent efforts in the COASTAL group to bridge these gaps.
responsiblesCandito