Controlling Linguistic Variability in Large Language Models

| Field | Value |
|---|---|
| title | Controlling Linguistic Variability in Large Language Models |
| start_date | 2026/04/10 |
| schedule | 11:00 |
| online | no |
| location_info | Big Blue Button videoconference |
| summary | Large language models (LLMs) have a considerable impact on Natural Language Processing and, more broadly, on society, so each of their limitations can have major consequences. I propose to study their generalization across several phenomena of linguistic variability: how LLMs learn to model linguistic variability, and how it can be controlled. The stakes are both theoretical, since LLM training can be contrasted with language acquisition in humans, and social, since language varies according to multiple sociolinguistic factors. Among the different linguistic levels of variability, I will begin with three: 1. morphological, where several affixes are in competition; 2. intralinguistic, for variability between dialects of the same language; 3. interlinguistic, for code-switching between languages. For each level, I will analyze the probability that an LLM assigns to each variant, which depends on the model's calibration. Variability can then be controlled by modifying the model's input, the model itself via different training methods, or the decoding method. |
| responsibles | Bawden |
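The summary's core measurement, comparing the probability an LLM assigns to competing variants, can be sketched as follows. This is a minimal illustration, not the speaker's actual method: the hand-crafted `TOY_LOGPROBS` table stands in for real model log-probabilities, and the example words are hypothetical.

```python
import math

# Toy stand-in for an LLM's per-token log-probabilities. A real study would
# query an actual model; these numbers are purely illustrative.
TOY_LOGPROBS = {
    "she": math.log(0.10),
    "dreamed": math.log(0.03),  # one morphological variant in competition
    "dreamt": math.log(0.01),   # the competing variant
}

def sequence_logprob(tokens):
    """Score a variant as the sum of its per-token log-probabilities."""
    return sum(TOY_LOGPROBS.get(t, math.log(1e-6)) for t in tokens)

def preferred_variant(variants):
    """Return the variant the (toy) model assigns the highest probability."""
    return max(variants, key=sequence_logprob)

# Morphological competition between two affixed forms of the same verb:
print(preferred_variant([["she", "dreamed"], ["she", "dreamt"]]))
```

Controlling variability would then mean shifting these scores, e.g. by changing the prompt, fine-tuning the model, or constraining decoding, as the summary outlines.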
Workflow history

| from state | to state | comment | date |
|---|---|---|---|
| submitted | published | | 2026/04/07 07:40 UTC |