| title | Computational morphology needs better lexical data |
|---|
| start_date | 2025/04/16 |
|---|
| schedule | 16h30-17h30 |
|---|
| online | no |
|---|
| location_info | Online |
|---|
| summary | Rich resources are key to supporting comparative studies in computational linguistics. Yet existing datasets face a number of issues, including:
– coverage: current resources document only a small proportion of the world's languages
– commensurability: it is rarely straightforward to compare resources
– consistent presentation: many datasets fall short of machine readability due to small variations in coding
– durability: project funding is temporary, and data maintenance beyond the funding term is rarely ensured
– technical skills: good data management requires technical skills that are rarely taught to linguists.
I illustrate this potential and these issues with the case of inflectional resources for quantitative morphology. I outline a path to improvement through standardisation (specifically the Paralex standard: http://www.paralex-standard.org) and large-scale international coordination. |
|---|
| responsibles | Bernard |
|---|
Workflow history
| from state | to state | comment | date |
| submitted | published | | 2025/04/22 12:46 UTC |