
ECIR
2009
Springer

Word Particles Applied to Information Retrieval

Document retrieval systems conventionally use words as the basic unit of representation, a natural choice since words are the primary carriers of semantic information. In this paper we propose the use of a different, phonetically defined unit of representation that we call "particles". Particles are phonetic sequences that do not possess meaning. Both documents and queries are converted from their standard word-based form into sequences of particles. Indexing and retrieval are performed with particles. Experiments show that this scheme is capable of achieving retrieval performance comparable to that of words when the text in the documents and queries is clean, and can result in significantly improved retrieval when it is noisy.
Evandro B. Gouvêa, Bhiksha Raj
Added 08 Mar 2010
Updated 08 Mar 2010
Type Conference
Year 2009
Where ECIR
Authors Evandro B. Gouvêa, Bhiksha Raj