Bootstrapping Language Neutral Term Extraction

13 years 7 months ago
Bootstrapping Language Neutral Term Extraction
A variety of methods exist for extracting terms and relations between terms from a corpus, each of them having strengths and weaknesses. Rather than just using the joint results, we apply different extraction methods in a way that the results of one method are input to another. This gives us the leverage to find terms and relations that otherwise would not be found. Our goal is to create a semantic model of a domain. To that end, we aim to find the complete terminology of the domain, consisting of terms and relations such as hyponymy and meronymy, and connected to generic wordnets and ontologies. Terms are ranked by domain-relevance only as a final step, after terminology extraction is completed. Because term relations are a large part of the semantics of a term, we estimate the relevance from its relation to other terms, in addition to occurrence and document frequencies. In the KYOTO project, we apply language-neutral terminology extraction from a parsed corpus for seven languages.
Wauter Bosma, Piek Vossen
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC
Authors Wauter Bosma, Piek Vossen
Comments (0)