NAACL
2010

Language identification of names with SVMs

The task of identifying the language of text or utterances has a number of applications in natural language processing. Language identification has traditionally been approached with character-level language models. However, the language model approach crucially depends on the length of the text in question. In this paper, we consider the problem of language identification of names. We show that an approach based on SVMs with n-gram counts as features performs much better than language models. We also experiment with applying the method to pre-process transliteration data for the training of separate models.
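The n-gram count features mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `char_ngram_counts`, the boundary markers `^` and `$`, the lowercasing step, and the maximum n-gram length are all assumptions made for illustration. The resulting counts would be the kind of feature vector passed to an SVM classifier.

```python
from collections import Counter


def char_ngram_counts(name, n_max=3, start="^", end="$"):
    """Count character n-grams of length 1..n_max in a single name.

    The boundary markers and lowercasing are illustrative assumptions,
    not details taken from the paper.
    """
    text = start + name.lower() + end
    counts = Counter()
    for n in range(1, n_max + 1):
        # Slide a window of width n across the padded name.
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


# Hypothetical usage: unigram and bigram counts for one name.
features = char_ngram_counts("Kondrak", n_max=2)
```

Because a name is only a few characters long, sparse counts like these give a classifier more to work with than the probability estimate of a character-level language model, which is the length dependence the abstract points to.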
Aditya Bhargava, Grzegorz Kondrak
Added 14 Feb 2011
Updated 14 Feb 2011
Type Conference
Year 2010
Where NAACL
Authors Aditya Bhargava, Grzegorz Kondrak