Untitled talk (event: Research in computational linguistics (Alpage seminar))

old_uid9918
start_date2011/05/06
schedule11h
onlineno
summaryWikipedia provides a repository of world knowledge with more structure than the web and more coverage than manually created knowledge bases. Although its category system can be used straightforwardly as a semantic network, it cannot be considered a proper taxonomy, because the relations between categories are not semantically typed. In this presentation we show how to induce an isa hierarchy on top of the Wikipedia categorization. We start by taking the Wikipedia category system as a conceptual network. We then label the semantic relations between categories using methods based on connectivity in the network and on lexico-syntactic matching. As a result, we derive a large-scale taxonomy with isa relations between the concepts. We evaluate the quality of the taxonomy by comparing it with ResearchCyc, one of the largest manually created ontologies, and show that the Wikipedia-derived taxonomy compares favorably with it. We also discuss experiments on using Wikipedia to compute the semantic similarity of words: the Wikipedia-derived taxonomy performs as well as measures using WordNet, a lexical database commonly used in natural language processing. We conclude with an overview of current work, which includes labeling additional relations, such as part-of, location, and temporal ones, and creating a multilingual conceptual network.
Publications relevant for this presentation:
Ponzetto, Simone Paolo; Strube, Michael (2011). Taxonomy induction based on a collaboratively built knowledge repository. In: Artificial Intelligence, to appear. (Short and slightly outdated version: Ponzetto, Simone Paolo; Strube, Michael (2007). Deriving a large scale taxonomy from Wikipedia. In: AAAI '07, pp. 1440-1445.)
Ponzetto, Simone Paolo; Strube, Michael (2007). Knowledge derived from Wikipedia for computing semantic relatedness. In: Journal of Artificial Intelligence Research 30, pp. 181-212. (Short and outdated version: Strube, Michael; Ponzetto, Simone Paolo (2006). WikiRelate! Computing semantic relatedness using Wikipedia. In: AAAI '06, pp. 1419-1424.)
Nastase, Vivi et al. (2010). WikiNet: A very large scale multi-lingual concept network. In: LREC 2010.
Nastase, Vivi; Strube, Michael (2008). Decoding Wikipedia Categories for Knowledge Acquisition. In: AAAI '08, pp. 1219-1224.
responsiblesCrabbé