A Statistical Corpus-Based Term Extractor

Abstract. Term extraction is an important problem in natural language processing. In this paper, we propose a language-independent, statistical, corpus-based term extraction algorithm. In previous approaches, evaluation has been subjective, at best relying on a lexicographer’s judgement. We evaluate our term extractor in two ways. First, we assess its predictiveness on an unseen corpus using perplexity. Second, we measure its precision and recall by comparing the Chinese words in a segmented corpus with the words extracted by our system.
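The precision and recall comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes both the extracted terms and the gold-standard words are compared as sets of distinct strings; the abstract does not say whether the authors count distinct words or individual occurrences. The function name `precision_recall` and the toy data are hypothetical and not taken from the paper.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of extracted terms against gold-standard words.

    Assumption: both inputs are treated as sets of distinct strings.
    """
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)  # terms found in both sets
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data for illustration only.
gold_words = {"北京", "大学", "学生", "研究"}
extracted_terms = {"北京", "大学", "北京大学", "研究"}

p, r = precision_recall(extracted_terms, gold_words)
print(f"precision={p:.2f} recall={r:.2f}")
```

Precision here is the fraction of extracted terms that appear in the segmented corpus, and recall is the fraction of corpus words that the extractor recovered.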
Patrick Pantel, Dekang Lin
Added 28 Jul 2010
Updated 28 Jul 2010
Type Conference
Year 2001
Where AI