In this paper we report our work on building a POS tagger for a morphologically rich language- Hindi. The theme of the research is to vindicate the stand that- if morphology is st...
: This article describes a multilayer model-based approach for text compression. It uses linguistic information to develop a multilayer decomposition model of the text in order to ...
Abstract. Automatic lemmatisation is a core application for many language processing tasks. In inflectionally rich languages, such as Slovene, assigning the correct lemma to each ...
We describe the Arabic broadcast transcription system elded by IBM in the GALE Phase 4 machine translation evaluation. Key advances over our Phase 3.5 system include improvements ...
Brian Kingsbury, Hagen Soltau, George Saon, Stephe...
Without any doubt corpora are vital tools for linguistic studies and solution for applied tasks. Although corpora opportunities are very useful, there is a need of another kind of...