On automatic creation of lexical semantic questionnaires

old_uid: 16420
title: On automatic creation of lexical semantic questionnaires
start_date: 2018/09/27
schedule: 14h-15h30
online: no
location_info: Ennat Léger room
details: as part of HELAN2
summary: We propose creating questionnaires for lexical typology (in the spirit of Rakhilina and Reznikova 2016) in an automated fashion. We evaluate our system on questionnaire creation for 'smooth', 'sharp', 'thick', and 'straight' (object features often, but not always, expressed by adjectives), and perform a quantitative and qualitative analysis of the results. Our algorithm consists of the following steps: 1) extracting a list of frequent phrases, or bigrams, of the form adjective + noun; 2) computing a co-occurrence-based vector representation for every noun phrase; 3) clustering the vector space; 4) extracting three core elements from each cluster, while eliminating all clusters containing fewer than three elements. The algorithm reveals semantic oppositions that are indeed typologically relevant. For example, many languages lexically distinguish sharp edges (e.g. knives) from sharp points (e.g. arrows), with two distinct adjectives meaning 'sharp': one for the first sense and another for the second (compare tranchant/aiguisé vs. pointu in French). Russian makes no such lexical distinction; still, the Russian noun phrases illustrating these context types fall into two different clusters (ostryj nož 'sharp knife', ostryj nožik 'sharp little knife', ostroje lezvije 'sharp blade' vs. ostraja strela 'sharp arrow', ostroje kop'jo 'sharp spear', ostryj kamen' 'sharp stone').
responsibles: Coupé
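
The four steps listed in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy English corpus, the `word/TAG` input format, the function names, the cosine threshold, and the use of single-link clustering (connected components over a similarity graph) are all assumptions made for illustration, since the abstract does not say which clustering method or vector weighting was used.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy corpus in "word/TAG" form (the tag names are an assumption).
CORPUS = [
    "the/DET sharp/ADJ knife/NOUN cuts/VERB bread/NOUN",
    "a/DET sharp/ADJ blade/NOUN cuts/VERB paper/NOUN",
    "the/DET sharp/ADJ razor/NOUN cuts/VERB hair/NOUN",
    "the/DET sharp/ADJ arrow/NOUN pierces/VERB armor/NOUN",
    "a/DET sharp/ADJ spear/NOUN pierces/VERB skin/NOUN",
    "the/DET sharp/ADJ needle/NOUN pierces/VERB cloth/NOUN",
    "the/DET sharp/ADJ cheese/NOUN tastes/VERB strong/ADJ",
]

def parse(lines):
    """Turn 'word/TAG' strings into lists of (word, tag) pairs."""
    return [[tuple(tok.rsplit("/", 1)) for tok in line.split()] for line in lines]

def frequent_bigrams(sentences, adjective, min_count=1):
    """Step 1: collect 'adjective + noun' bigrams seen at least min_count times."""
    counts = Counter()
    for sent in sentences:
        for (w1, t1), (w2, t2) in zip(sent, sent[1:]):
            if w1 == adjective and t1 == "ADJ" and t2 == "NOUN":
                counts[(w1, w2)] += 1
    return [bg for bg, c in counts.items() if c >= min_count]

def cooccurrence_vectors(sentences, bigrams, content_tags=("NOUN", "VERB")):
    """Step 2: count content words sharing a sentence with each phrase."""
    vectors = {bg: Counter() for bg in bigrams}
    for sent in sentences:
        for i in range(len(sent) - 1):
            bg = (sent[i][0], sent[i + 1][0])
            if bg in vectors:
                context = sent[:i] + sent[i + 2:]
                vectors[bg].update(w for w, tag in context if tag in content_tags)
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(vectors, threshold=0.3):
    """Step 3 (assumed method): link phrases with cosine >= threshold, take components."""
    items = list(vectors)
    parent = {x: x for x in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)
    groups = defaultdict(list)
    for x in items:
        groups[find(x)].append(x)
    return list(groups.values())

def core_elements(clusters, vectors, n_core=3):
    """Step 4: drop clusters with fewer than n_core phrases, keep those nearest the centroid."""
    result = []
    for members in clusters:
        if len(members) < n_core:
            continue
        centroid = Counter()
        for m in members:
            centroid.update(vectors[m])
        ranked = sorted(members, key=lambda m: cosine(vectors[m], centroid), reverse=True)
        result.append(ranked[:n_core])
    return result

def build_questionnaire(lines, adjective):
    """Run the four steps end to end for one target adjective."""
    sentences = parse(lines)
    bigrams = frequent_bigrams(sentences, adjective)
    vectors = cooccurrence_vectors(sentences, bigrams)
    return core_elements(cluster(vectors), vectors)

questionnaire = build_questionnaire(CORPUS, "sharp")
for group in questionnaire:
    print([" ".join(bigram) for bigram in group])
```

On this toy corpus the cutting contexts (knife, blade, razor) and the piercing contexts (arrow, spear, needle) land in separate groups, while the lone phrase "sharp cheese" is discarded because its cluster has fewer than three members. A real run would need a tagged corpus of the target language, a frequency cutoff above 1, and most likely a weighted vector space and a standard clustering method in place of the thresholded graph used here.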