
EMNLP
2010

Cross Language Text Classification by Model Translation and Semi-Supervised Learning

In this paper, we introduce a method that automatically builds text classifiers in a new language by training on labeled data in another language. Our method transfers classification knowledge across languages by translating the model features and by using an Expectation Maximization (EM) algorithm that naturally takes into account the ambiguity associated with the translation of a word. We further exploit the readily available unlabeled data in the target language via semi-supervised learning, and adapt the translated model to better fit the data distribution of the target language.
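The idea of translating model features under ambiguity can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a naive Bayes source model, a hypothetical bilingual dictionary listing candidate translations per word, class-conditional translation probabilities initialized uniformly, and a simplified EM loop over unlabeled target documents. All names and data below (translate_model, em_translate, the English and Spanish words) are invented for illustration, and the later semi-supervised adaptation step is not shown.

```python
import math
from collections import defaultdict


def translate_model(src_word_probs, dictionary, trans):
    """Assumed form: P(t | c) = sum_s P(s | c) * P(t | s, c), renormalized per class."""
    tgt = {}
    for c, probs in src_word_probs.items():
        scores = defaultdict(float)
        for s, p in probs.items():
            for t in dictionary.get(s, []):
                scores[t] += p * trans[c][s][t]
        total = sum(scores.values()) or 1.0
        tgt[c] = {t: v / total for t, v in scores.items()}
    return tgt


def posterior(doc, tgt_word_probs, priors, floor=1e-6):
    """P(c | doc) under naive Bayes; unseen words get a small floor probability."""
    logs = {
        c: math.log(priors[c])
        + sum(math.log(tgt_word_probs[c].get(w, floor)) for w in doc)
        for c in priors
    }
    top = max(logs.values())
    exps = {c: math.exp(v - top) for c, v in logs.items()}
    z = sum(exps.values())
    return {c: v / z for c, v in exps.items()}


def em_translate(src_word_probs, priors, dictionary, docs, iterations=5, smooth=0.1):
    # Start with uniform translation probabilities for every candidate.
    trans = {
        c: {s: {t: 1.0 / len(ts) for t in ts} for s, ts in dictionary.items()}
        for c in priors
    }
    for _ in range(iterations):
        model = translate_model(src_word_probs, dictionary, trans)
        # E-step: soft-assign each unlabeled document to classes and
        # accumulate expected target-word counts per class.
        counts = {c: defaultdict(float) for c in priors}
        for doc in docs:
            for c, p in posterior(doc, model, priors).items():
                for w in doc:
                    counts[c][w] += p
        # M-step: re-estimate translation probabilities from expected counts.
        for c in priors:
            for s, ts in dictionary.items():
                weights = {t: counts[c][t] + smooth for t in ts}
                z = sum(weights.values())
                trans[c][s] = {t: w / z for t, w in weights.items()}
    return translate_model(src_word_probs, dictionary, trans), trans


# Hypothetical toy data: an English source model and unlabeled Spanish documents.
priors = {"sports": 0.5, "finance": 0.5}
src_word_probs = {
    "sports": {"match": 0.7, "goal": 0.3},
    "finance": {"bank": 0.7, "money": 0.3},
}
dictionary = {
    "match": ["partido", "cerilla"],  # game vs. matchstick
    "goal": ["gol"],
    "bank": ["banco", "orilla"],      # financial bank vs. river bank
    "money": ["dinero"],
}
docs = [
    ["partido", "gol", "partido"],
    ["gol", "partido"],
    ["banco", "dinero", "banco"],
    ["dinero", "banco"],
]

model, trans = em_translate(src_word_probs, priors, dictionary, docs)
print(trans["sports"]["match"])
print(posterior(["banco", "dinero"], model, priors))
```

In this toy setup the unlabeled target documents pull the translation of "match" toward "partido" and "bank" toward "banco", which is the kind of disambiguation the abstract attributes to EM.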
Lei Shi, Rada Mihalcea, Mingjun Tian
Added 11 Feb 2011
Updated 11 Feb 2011
Type Conference
Year 2010
Where EMNLP
Authors Lei Shi, Rada Mihalcea, Mingjun Tian