Sciweavers

CICLING
2007
Springer

A Competitive Term Selection Method for Information Retrieval

Term selection is a necessary component of most natural language processing tasks. Although various unsupervised techniques have been proposed, the best results, for instance those based on entropy, come at a high computational cost. This paper proposes an unsupervised term selection technique based on a bigram-enriched version of the transition point. The approach first reduces the corpus vocabulary with the transition point technique and then expands the reduced corpus with bigrams obtained from the same corpus, i.e., without external knowledge sources. It provides a considerable dimensionality reduction of the TREC-5 collection and has also been shown to improve precision for some entropy-based methods.
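The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one commonly cited formulation of the transition point, TP = (sqrt(8 * I1 + 1) - 1) / 2, where I1 is the number of terms that occur exactly once. The neighborhood width, the bigram frequency threshold, the rule that both words of a bigram must already be selected, and all function names are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def transition_point(freqs):
    """Assumed formulation: TP = (sqrt(8 * I1 + 1) - 1) / 2,
    where I1 is the number of terms with frequency 1."""
    i1 = sum(1 for f in freqs.values() if f == 1)
    return (math.sqrt(8 * i1 + 1) - 1) / 2


def select_terms(tokens, width=0.4):
    """Keep terms whose frequency lies within +/- width * TP of the
    transition point (the width is a hypothetical parameter)."""
    freqs = Counter(tokens)
    tp = transition_point(freqs)
    low, high = tp * (1 - width), tp * (1 + width)
    return {t for t, f in freqs.items() if low <= f <= high}


def enrich_with_bigrams(tokens, terms, min_count=2):
    """Add bigrams taken from the same token stream, with no external
    resources. Assumption: a bigram is kept when it repeats at least
    min_count times and both of its words were already selected."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    extra = {
        f"{a}_{b}"
        for (a, b), count in bigrams.items()
        if count >= min_count and a in terms and b in terms
    }
    return terms | extra
```

Calling `select_terms` on a tokenized corpus and passing the result to `enrich_with_bigrams` gives a reduced vocabulary of mid-frequency terms plus the repeated bigrams built from them.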
Added 07 Jun 2010
Updated 07 Jun 2010
Type Conference
Year 2007
Where CICLING
Authors Franco Rojas López, Héctor Jiménez-Salazar, David Pinto