state: published

German Text Simplification: Scarce Data and Other Challenges

title	German Text Simplification: Scarce Data and Other Challenges
start_date	2023/12/01
schedule	14h
online	no
location_info	Doyen 22 & via Teams
summary	Text simplification is an intra-lingual translation task in which the documents or sentences of a complex source text are simplified for a specific target audience. Many new text simplification models have been proposed in recent years and months, but their quality is often hard to judge. In most cases, we know too little about the training data and about the kind of simplification we can expect from the models. In addition, we too often rely on controversial automatic evaluations, especially for languages other than English. In our view, the success of automatic text simplification systems depends as much on the quality of the parallel data used for training and evaluation as on the text simplification models themselves, if not more. This talk will look at each stage of the text simplification pipeline, particularly the data and annotation aspects, and discuss how it could be improved. Topics include i) facilitating the construction of new high-quality text simplification corpora, ii) improving existing corpora through new annotations, including annotations of a) simplification operations, b) quality assessment, and c) error operations, and iii) rethinking the current evaluation process. We will illustrate the problem areas using German texts as examples.
responsibles	Rolin

Workflow history

from state (1)	to state	comment	date
submitted	published		2023/11/29 13:14 UTC

hosted_by

Université Catholique de Louvain

speakers

event_of

Traitement automatique du langage (seminar of the Centre de- (CENTAL), Institut Langage et Communication, UCLouvain, Louvain-La-Neuve, Belgium) (2023)

Event #875687 - created on 2023/11/16