Sciweavers

SIGIR
2002
ACM

Improving stemming for Arabic information retrieval: light stemming and co-occurrence analysis

Download ciir.cs.umass.edu

Arabic, a highly inflected language, requires good stemming for effective information retrieval, yet no standard approach to stemming has emerged. We developed several light stemmers based on heuristics and a statistical stemmer based on co-occurrence for Arabic retrieval. We compared the retrieval effectiveness of our stemmers and of a morphological analyzer on the TREC-2001 data. The best light stemmer was more effective for cross-language retrieval than a morphological stemmer which tried to find the root for each word. A repartitioning process consisting of vowel removal followed by clustering using co-occurrence analysis produced stem classes which were better than no stemming or very light stemming, but still inferior to good light stemming or morphological analysis.

Categories and Subject Descriptors: H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing
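
To make the idea of heuristic light stemming concrete, here is a minimal sketch in Python. It is not the authors' stemmer: the affix lists, the `MIN_STEM_LEN` threshold, and the function name `light_stem` are all illustrative assumptions, since the abstract does not specify them. The sketch strips at most one common prefix (definite article, optionally preceded by a conjunction or preposition) and at most one common suffix, without trying to find the root.

```python
# Illustrative affix lists. These are assumptions chosen for the example,
# not the lists used in the paper.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي"]

# Hypothetical lower bound on what must remain after stripping an affix.
MIN_STEM_LEN = 2


def light_stem(word: str) -> str:
    """Strip at most one listed prefix and one listed suffix from a word."""
    # Try longer prefixes first so "وال" wins over "ال".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    # Same policy for suffixes: longest match first, one removal only.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[: -len(suffix)]
            break
    return word


if __name__ == "__main__":
    for w in ["الكتاب", "والمعلمون", "في"]:
        print(w, "->", light_stem(w))
```

Words that share a stem under such a function would be conflated at indexing and query time. The length check is one simple way to avoid reducing very short words to nothing; a real stemmer would likely need more careful conditions.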

Leah S. Larkey, Lisa Ballesteros, Margaret E. Connell

Effective Information Retrieval | Information Retrieval | Information Technology | Light Stemmer | SIGIR 2002

Related Content

» Arabic Stemming Without A Root Dictionary

» Arabic Retrieval Revisited Morphological Hole Filling

» Unsupervised Learning of Arabic Stemming Using a Parallel Corpus

» Empirical studies in strategies for Arabic retrieval

» Stemming Approaches for East European Languages

» Light stemming approaches for the French Portuguese German and Hungarian languages

» A Study on the Use of Stemming for Monolingual AdHoc Portuguese Information Retrieval

» Statistical vs RuleBased Stemming for Monolingual French Retrieval

» Single ngram stemming

Post Info

Added	23 Dec 2010
Updated	23 Dec 2010
Type	Journal
Year	2002
Where	SIGIR
Authors	Leah S. Larkey, Lisa Ballesteros, Margaret E. Connell