Sciweavers

CLEF
2011
Springer

Adapting Statistical Language Identification Methods for Short Queries

This paper describes the participation of the UAIC team in the language identification task of the LogCLEF 2011 initiative. Our approach aggregates known language recognition methods. Short texts are a real challenge for a language identification tool, so our methods had to be robust to noisy input such as single letters, digit-only strings, links, and assorted symbols. We applied n-gram extraction with distance computation and a learning algorithm. The results were satisfactory for specific languages, considering that our system supports only a limited number of languages.
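The abstract does not specify which n-gram distance or learning algorithm the system uses. As an illustration only, the sketch below shows one common way to combine character n-gram extraction with a distance measure: a rank-order "out-of-place" comparison between a query profile and per-language profiles. All names (`clean`, `build_profile`, `out_of_place`, `identify`, `SAMPLES`, `PROFILES`), the thresholds, and the two toy training sentences are assumptions made for this example, not details from the paper, and the learning component is omitted.

```python
import re
from collections import Counter

# Assumption: links are recognised by a simple prefix pattern and dropped.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")


def clean(query):
    """Remove links, lowercase, and trim surrounding whitespace."""
    return URL_RE.sub(" ", query).lower().strip()


def char_ngrams(text, n_min=1, n_max=3):
    """Return all character n-grams of the space-padded text."""
    padded = f" {text} "
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def build_profile(text, top_k=300):
    """Map each of the top_k most frequent n-grams to its rank (0 = most frequent)."""
    counts = Counter(char_ngrams(clean(text)))
    # Ties are broken alphabetically so that profiles are deterministic.
    ranked = sorted(counts, key=lambda gram: (-counts[gram], gram))
    return {gram: rank for rank, gram in enumerate(ranked[:top_k])}


def out_of_place(query_profile, lang_profile):
    """Sum of rank differences; n-grams missing from the language get a fixed penalty."""
    max_penalty = len(lang_profile)
    total = 0
    for gram, rank in query_profile.items():
        if gram in lang_profile:
            total += abs(rank - lang_profile[gram])
        else:
            total += max_penalty
    return total


def identify(query, profiles, min_letters=2):
    """Return the closest language, or None when the query is too noisy to judge."""
    text = clean(query)
    # Reject single letters, digit-only strings, bare links, and pure symbols.
    if sum(ch.isalpha() for ch in text) < min_letters:
        return None
    query_profile = build_profile(text)
    return min(profiles, key=lambda lang: out_of_place(query_profile, profiles[lang]))


# Toy training data (hypothetical); a real system would use large corpora.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze",
}
PROFILES = {lang: build_profile(text) for lang, text in SAMPLES.items()}

print(identify("12345", PROFILES))               # no letters, so no decision
print(identify("http://example.com", PROFILES))  # link only, so no decision
print(identify("the lazy dog", PROFILES))        # closest toy profile
```

Returning `None` for queries with too few letters is one simple way to handle the noisy inputs the abstract mentions. With profiles trained on one sentence per language, the output for real queries is not meaningful.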
Added 18 Dec 2011
Updated 18 Dec 2011
Type Journal
Year 2011
Where CLEF
Authors Alexandru-Lucian Gînsca, Emanuela Boros, Adrian Iftene