SIGIR
2009
ACM

Experiments in CLIR using fuzzy string search based on surface similarity

Cross Language Information Retrieval (CLIR) between languages of the same origin is an interesting topic of research. The similarity of the writing systems used for these languages can be used effectively not only to improve CLIR but also to overcome, to an extent, the problems of textual variation, textual errors, and even the lack of linguistic resources such as stemmers. We have conducted CLIR experiments between three languages that use writing systems (scripts) of Brahmi origin, namely Hindi, Bengali and Marathi. We found significant improvements for all six language pairs using a method for fuzzy text search based on Surface Similarity. In this paper, we report these results and compare them with a baseline CLIR system and a CLIR system that uses Scaled Edit Distance (SED) for fuzzy string matching.

Categories and Subject Descriptors: H.3.2 [Information Storage and Retrieval]: Query formulation; H.3.1 [Content Analysis and Indexing]: Linguistic processing

General Terms: Algorithms, ...
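
The abstract names Scaled Edit Distance (SED) as a comparison system but does not define it. As a rough illustration of edit-distance-based fuzzy string matching in general, here is a minimal Python sketch. It assumes that "scaled" means dividing the Levenshtein distance by the length of the longer string, which may differ from the authors' formulation. The function names, the threshold value, and the Latin-script sample words are all hypothetical, and the sketch does not attempt to reproduce the Surface Similarity method.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insert, delete, substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def scaled_distance(a: str, b: str) -> float:
    """ASSUMPTION: scale by the longer string's length, giving a value in [0, 1]."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def fuzzy_matches(query: str, vocabulary: list[str], threshold: float = 0.25) -> list[str]:
    """Return vocabulary words whose scaled distance to the query is within
    the (hypothetical) threshold."""
    return [w for w in vocabulary if scaled_distance(query, w) <= threshold]


if __name__ == "__main__":
    print(fuzzy_matches("color", ["colour", "collar", "dog"]))
```

In a CLIR setting, a filter of this kind could be applied between a query term and the index vocabulary of the target language, so that spelling variants still match when no stemmer is available. The threshold would need tuning per language pair.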
Added 28 May 2010
Updated 28 May 2010
Type Conference
Year 2009
Where SIGIR
Authors Sethuramalingam Subramaniam, Anil Kumar Singh, Pradeep Dasigi, Vasudeva Varma