Tensor factorization methods for lexical acquisition and the modeling of semantic compositionality

old_uid14979
titleTensor factorization methods for lexical acquisition and the modeling of semantic compositionality
start_date2015/01/23
schedule11h-12h30
onlineno
location_infosalle 255
summaryIn this talk, we will look at a number of tensor factorization methods for the modeling of language data. First, we present a method for the joint unsupervised acquisition of verb subcategorization frame (SCF) and selectional preference (SP) information. Treating SCF and SP induction as a multi-way co-occurrence problem, we use multi-way tensor factorization to cluster frequent verbs from a large corpus according to their syntactic and semantic behaviour. The method is able to predict whether a syntactic argument is likely to occur with a verb lemma (SCF) as well as which lexical items are likely to occur in the argument slot (SP). Second, we present a method for the computation of semantic compositionality within a distributional framework. We use our method to model the composition of subject-verb-object triples. The key idea is that compositionality is modeled as a multi-way interaction between latent factors, which are automatically constructed from corpus data. The method consists of two steps. First, we compute a latent factor model for nouns from standard co-occurrence data. Next, the latent factors are used to induce a latent model of three-way subject-verb-object interactions. By treating language data as multi-way co-occurrence frequencies, both methods are able to model the tasks at hand in an entirely unsupervised way.
responsiblesCandito
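
The two-step compositionality method described in the summary can be illustrated with a minimal sketch. It is not the speaker's implementation: the summary does not say which factorization is used, so the sketch assumes non-negative matrix factorization (NMF) with multiplicative updates for the noun step and a simple projection of each verb's subject-by-object count slice onto the noun factors for the three-way step. The function names `nmf` and `project_triples`, the toy dimensions, and the randomly generated counts are all hypothetical.

```python
import numpy as np


def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (n x m) as W (n x k) @ H (k x m).

    Uses the standard multiplicative updates for the squared-error
    objective. Assumption: the talk summary does not name the
    factorization used for the noun model; NMF is chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def project_triples(X, W):
    """Project a verbs x subjects x objects tensor onto noun factors.

    Returns G with shape (verbs, k, k), where G[v] = W.T @ X[v] @ W,
    i.e. one latent subject-by-object interaction matrix per verb.
    """
    return np.einsum("vso,si,oj->vij", X, W, W)


# Toy co-occurrence counts, invented purely for illustration.
rng = np.random.default_rng(42)
n_nouns, n_contexts, n_verbs, k = 6, 8, 3, 2
noun_context = rng.poisson(2.0, size=(n_nouns, n_contexts)).astype(float)
svo = rng.poisson(1.0, size=(n_verbs, n_nouns, n_nouns)).astype(float)

# Step 1: latent factor model for nouns from two-way co-occurrence data.
W, H = nmf(noun_context, k)

# Step 2: latent model of three-way subject-verb-object interactions.
G = project_triples(svo, W)
print(G.shape)  # (3, 2, 2)
```

Each `G[v]` is a small matrix indicating how strongly latent subject factor `i` and latent object factor `j` interact for verb `v`. On real data the count tensor would be very sparse and the counts would typically be weighted before factorization.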