Linguistic Universals in Grammars and Language Models

title: Linguistic Universals in Grammars and Language Models
start_date: 2025/02/12
schedule: 15h30-16h30
online: no
location_info: online
summary: The shared universal properties of human languages have been at the heart of linguistic debates for decades. At the core of these debates are two questions: (1) learnability — can all linguistic universals be learned from data alone, without any inbuilt prior knowledge, and (2) explanation — why do we see these universals and not others? In the first part of the talk, I will present recent results on how LLMs fare at picking up a syntactic universal in an idealized scenario where an LLM is trained on large amounts of data drawn from a large number of languages. As usual, increasing the number of parameters and the amount of data helps, but does not fully solve the learnability problem. Moreover, even if LLMs could learn a syntactic universal, their performance alone would not explain why the observed syntactic universal exists in the first place. In the second part of the talk, I will show how CCG syntactic theory can provide not only an explanation of why some universals exist but also a prediction of which word orders we will not find in human languages.
responsibles: Bernard