ACL
2000

A Morphologically Sensitive Clustering Algorithm for Identifying Arabic Roots

We present a clustering algorithm for grouping Arabic words that share the same root. Root-based clusters can substitute for dictionaries in indexing for information retrieval (IR). Modifying Adamson and Boreham (1974), our two-stage algorithm applies light stemming before calculating word-pair similarity coefficients using techniques sensitive to Arabic morphology. Tests show successful treatment of infixes and clustering accuracy of up to 94.06% on unedited Arabic text samples, without the use of dictionaries.
Anne N. De Roeck, Waleed Al-Fares
Added: 01 Nov 2010
Updated: 01 Nov 2010
Type: Conference
Year: 2000
Where: ACL
Authors: Anne N. De Roeck, Waleed Al-Fares
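
The abstract names two stages: light stemming, then pairwise similarity coefficients in the style of Adamson and Boreham (1974). The Python sketch below illustrates only that baseline idea, using the Dice coefficient over shared unique bigrams, and is not the authors' implementation. The affix lists, the 0.6 threshold, the single-link grouping, and all function names are assumptions made for illustration. Words are written in Buckwalter transliteration (for example `AlktAb` for "the book") to keep the example ASCII-only. The paper's morphology-sensitive modifications are not reproduced here.

```python
from itertools import combinations

# Hypothetical affix lists (Buckwalter transliteration), for illustration only;
# these are NOT the lists used in the paper.
PREFIXES = ("Al", "w")
SUFFIXES = ("At", "wn", "p")


def light_stem(word, min_len=3):
    """Strip at most one listed prefix and one listed suffix,
    never shortening the word below min_len characters."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= min_len:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= min_len:
            word = word[:-len(s)]
            break
    return word


def bigrams(word):
    """Set of unique adjacent character pairs in the word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def dice(a, b):
    """Dice coefficient 2C / (A + B) over unique bigrams."""
    x, y = bigrams(a), bigrams(b)
    if not x or not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def cluster(words, threshold=0.6):
    """Group words whose stemmed forms reach the similarity threshold
    (single-link grouping via union-find, an assumption of this sketch)."""
    stems = {w: light_stem(w) for w in words}
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path compression
            w = parent[w]
        return w

    for a, b in combinations(words, 2):
        if dice(stems[a], stems[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for w in words:
        groups.setdefault(find(w), []).append(w)
    return list(groups.values())


words = ["ktAb", "AlktAb", "kAtb", "mdrsp", "mdrs"]
print(cluster(words))
```

In this sketch, `ktAb` and `kAtb` share the root k-t-b but have no bigram in common, so the plain coefficient scores them at zero and leaves them in separate clusters. That is the kind of infix case the abstract says the authors' modified similarity calculation is designed to handle.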