|
On automatic creation of lexical semantic questionnaires| old_uid | 16420 |
|---|
| title | On automatic creation of lexical semantic questionnaires |
|---|
| start_date | 2018/09/27 |
|---|
| schedule | 14h-15h30 |
|---|
| online | no |
|---|
| location_info | Ennat Léger room |
|---|
| details | as part of HELAN2 |
|---|
| summary | We propose creating questionnaires for lexical typology (in the spirit of Rakhilina and Reznikova 2016) in an automated fashion. We evaluate our system on questionnaire creation for 'smooth', 'sharp', 'thick', and 'straight' (object features often but not always expressed by adjectives), and perform a quantitative and qualitative analysis of the results. Our algorithm consists of the following steps: 1) extracting a list of frequent phrases, or bigrams, of the form adjective + noun; 2) computing a co-occurrence-based vector representation for every noun phrase; 3) clustering the vector space; 4) extracting three core elements from each cluster while eliminating all clusters containing fewer than three elements. This algorithm reveals semantic oppositions that are indeed typologically relevant. For example, many languages lexically distinguish sharp edges (e.g. knives) from sharp points (e.g. arrows), with two distinct adjectives meaning 'sharp', one for each sense (compare tranchant/aiguisé vs. pointu in French). Russian makes no such distinction; still, Russian noun phrases illustrating these context types fall into two different clusters (ostryj nož 'sharp knife', ostryj nožik 'sharp little knife', ostroje lezvije 'sharp blade' vs. ostraja strela 'sharp arrow', ostroje kop'ë 'sharp spear', ostryj kamen 'sharp stone'). |
|---|
| responsibles | Coupé |
|---|
| |
|
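
The four steps listed in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy corpus (English glosses with hand-assigned `ADJ`/`NOUN`/`VERB` tags), the function names, the same-sentence context window, and the greedy cosine-threshold clustering with `threshold=0.3` are all assumptions made for illustration, since the abstract does not specify the corpus, the vector model, or the clustering method.

```python
from collections import Counter, defaultdict
from math import sqrt

def parse(line):
    """Turn 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    return [tuple(tok.split("/")) for tok in line.split()]

def frequent_bigrams(sentences, min_count=1):
    """Step 1: collect adjective + noun bigrams seen at least min_count times."""
    counts = Counter()
    for sent in sentences:
        for (w1, t1), (w2, t2) in zip(sent, sent[1:]):
            if t1 == "ADJ" and t2 == "NOUN":
                counts[(w1, w2)] += 1
    return [bg for bg, c in counts.items() if c >= min_count]

def phrase_vectors(sentences, bigrams):
    """Step 2: co-occurrence vector per phrase (other words of the same sentence)."""
    wanted = set(bigrams)
    vectors = defaultdict(Counter)
    for sent in sentences:
        words = [w for w, _ in sent]
        for i in range(len(words) - 1):
            bg = (words[i], words[i + 1])
            if bg in wanted:
                vectors[bg].update(words[:i] + words[i + 2:])
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def cluster(vectors, threshold=0.3):
    """Step 3 (assumed method): greedy clustering on cosine similarity to centroids."""
    clusters = []  # each entry: (centroid Counter, list of member phrases)
    for phrase in sorted(vectors):
        vec = vectors[phrase]
        for centroid, members in clusters:
            if cosine(centroid, vec) >= threshold:
                centroid.update(vec)
                members.append(phrase)
                break
        else:
            clusters.append((Counter(vec), [phrase]))
    return clusters

def core_elements(clusters, vectors, k=3):
    """Step 4: drop clusters with fewer than k members, keep the k nearest the centroid."""
    result = []
    for centroid, members in clusters:
        if len(members) < k:
            continue
        ranked = sorted(members, key=lambda p: (-cosine(centroid, vectors[p]), p))
        result.append(ranked[:k])
    return result

# Hypothetical toy corpus, loosely modelled on the 'sharp' example in the summary.
corpus = [parse(s) for s in [
    "cook/NOUN cut/VERB bread/NOUN sharp/ADJ knife/NOUN",
    "cook/NOUN cut/VERB meat/NOUN sharp/ADJ blade/NOUN",
    "cook/NOUN cut/VERB rope/NOUN sharp/ADJ razor/NOUN",
    "hunter/NOUN pierce/VERB deer/NOUN sharp/ADJ arrow/NOUN",
    "hunter/NOUN pierce/VERB boar/NOUN sharp/ADJ spear/NOUN",
    "hunter/NOUN pierce/VERB hide/NOUN sharp/ADJ stone/NOUN",
    "critic/NOUN have/VERB sharp/ADJ tongue/NOUN",
]]

bigrams = frequent_bigrams(corpus)
vectors = phrase_vectors(corpus, bigrams)
questionnaire = core_elements(cluster(vectors), vectors)
for group in questionnaire:
    print(", ".join(" ".join(phrase) for phrase in group))
```

On this toy corpus the point-type phrases (arrow, spear, stone) and the edge-type phrases (blade, knife, razor) land in separate clusters, while the single-member cluster for "sharp tongue" is discarded by the three-element filter. A real run would need a tagged corpus of realistic size and a clustering method chosen and tuned for it.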