Arabic, a highly inflected language, requires good stemming for effective information retrieval, yet no standard approach to stemming has emerged. We developed several light stemm...
Leah S. Larkey, Lisa Ballesteros, Margaret E. Conn...
This paper describes a project tagging a spontaneous speech corpus with morphological information such as word segmentation and parts-ofspeech. We use a morphological analysis sys...
Abstract. This paper describes the application of the perceptron algorithm to the morphological disambiguation of Turkish text. Turkish has a productive derivational morphology. Du...
In statistical language modeling, one technique to reduce the problematic effects of data sparsity is to partition the vocabulary into equivalence classes. In this paper we invest...
This paper describes our work on building Part-of-Speech (POS) tagger for Bengali. We have use Hidden Markov Model (HMM) and Maximum Entropy (ME) based stochastic taggers. Bengali...