Sciweavers

ACL
2009

Modeling Morphologically Rich Languages Using Split Words and Unstructured Dependencies

We experiment with splitting words into their stem and suffix components to model morphologically rich languages. We show that using a morphological analyzer and disambiguator yields a significant perplexity reduction for Turkish. We present flexible n-gram models, FlexGrams, which assume that the n-1 tokens that determine the probability of a given token can be chosen from anywhere in the sentence rather than only from the preceding n-1 positions. Our final model achieves a 27% perplexity reduction compared to the standard n-gram model.
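The contrast between a standard n-gram context and a flexible one can be illustrated with a small Python sketch. It is not the authors' implementation: the function names, the `<pad>` symbol, the use of relative offsets to pick conditioning positions, and the add-one smoothing are all assumptions made for illustration. It also only estimates per-token conditional probabilities and does not check that the chosen offsets define a valid joint distribution over the sentence.

```python
import math
from collections import Counter

PAD = "<pad>"  # hypothetical symbol for positions outside the sentence


def context_at(tokens, i, offsets):
    """Return the conditioning tokens for position i, one per relative offset."""
    return tuple(
        tokens[i + d] if 0 <= i + d < len(tokens) else PAD
        for d in offsets
    )


def train(sentences, offsets):
    """Count (context, token) pairs for the given conditioning offsets."""
    pair_counts, ctx_counts, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        vocab.update(tokens)
        for i, tok in enumerate(tokens):
            ctx = context_at(tokens, i, offsets)
            pair_counts[(ctx, tok)] += 1
            ctx_counts[ctx] += 1
    return pair_counts, ctx_counts, vocab


def perplexity(sentences, model, offsets):
    """Per-token perplexity with add-one smoothing (an assumed choice)."""
    pair_counts, ctx_counts, vocab = model
    v = len(vocab) + 1  # one extra slot for unseen tokens
    log_sum, n = 0.0, 0
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            ctx = context_at(tokens, i, offsets)
            p = (pair_counts[(ctx, tok)] + 1) / (ctx_counts[ctx] + v)
            log_sum += math.log(p)
            n += 1
    return math.exp(-log_sum / n)


# Standard bigram: condition on the immediately preceding token.
standard = (-1,)
# One possible flexible choice: condition on the token two positions back.
flexible = (-2,)

# Hand-split toy input; the "+ler" suffix notation is illustrative only.
data = [["ev", "+ler", "güzel"]]
for offsets in (standard, flexible):
    model = train(data, offsets)
    print(offsets, perplexity(data, model, offsets))
```

In this sketch a standard bigram model is the special case `offsets=(-1,)`; comparing held-out perplexity across candidate offset tuples is one plausible way to choose among flexible contexts, though the selection procedure actually used in the paper is not described in this abstract.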
Deniz Yuret, Ergun Biçici
Added 16 Feb 2011
Updated 16 Feb 2011
Type Journal
Year 2009
Where ACL
Authors Deniz Yuret, Ergun Biçici