Sciweavers

CLEF
2006
Springer

Statistical vs. Rule-Based Stemming for Monolingual French Retrieval

This paper describes our approach to the 2006 Adhoc Monolingual Information Retrieval run for French. The goal of our experiment was to compare the performance of a proposed statistical stemmer with that of a rule-based stemmer, specifically the French version of Porter's stemmer. The statistical stemming approach is based on lexicon clustering, using a novel string distance measure. We submitted three official runs, in addition to a baseline run that uses no stemming. The results show that stemming significantly improves retrieval performance (as expected), by about 9-10%, and that the performance of the statistical stemmer is comparable to that of the rule-based stemmer.

Categories and Subject Descriptors: H.3 [Information Storage and Retrieval]: H.3.1 Content Analysis and Indexing; H.3.3 Information Search and Retrieval; H.3.4 Systems and Software; H.3.7 Digital Libraries; H.2.3 [Database Management]: Languages--Query Languages

General Terms: Performance, Experimentation

Keywords: statistica...
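To make the idea of stemming by lexicon clustering concrete, the following is a minimal Python sketch. It is not the authors' method: the abstract does not define their string distance measure or clustering procedure, so `prefix_distance`, `cluster_lexicon`, `stem_map`, the threshold value, and the toy lexicon are all hypothetical stand-ins chosen for illustration only.

```python
from itertools import combinations


def prefix_distance(a: str, b: str) -> float:
    """Hypothetical stand-in distance (NOT the paper's measure).

    Returns 0.0 for identical words and 1.0 when the words share no
    common prefix. Assumes both words are non-empty.
    """
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return 1.0 - common / max(len(a), len(b))


def cluster_lexicon(words, threshold):
    """Threshold-based complete-linkage clustering (illustrative assumption).

    Repeatedly merges the first pair of clusters in which every
    cross-cluster word pair lies within `threshold`, until no such
    pair remains.
    """
    clusters = [[w] for w in sorted(set(words))]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if all(prefix_distance(a, b) <= threshold
                   for a in clusters[i] for b in clusters[j]):
                clusters[i].extend(clusters[j])
                del clusters[j]
                merged = True
                break  # cluster indices changed, so restart the scan
    return clusters


def stem_map(clusters):
    """Map every word to the shortest member of its cluster."""
    return {w: min(c, key=len) for c in clusters for w in c}


# Toy lexicon, invented for this example.
lexicon = ["chanter", "chantons", "chantez", "manger", "mangeons"]
stems = stem_map(cluster_lexicon(lexicon, threshold=0.5))
```

In this sketch, words that fall into the same cluster are conflated to a single index term at retrieval time, which is the role a rule-based stemmer such as Porter's plays through suffix-stripping rules. The threshold controls how aggressively variants are merged.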
Added 20 Aug 2010
Updated 20 Aug 2010
Type Conference
Year 2006
Where CLEF
Authors Prasenjit Majumder, Mandar Mitra, Kalyankumar Datta