Cross Language Text Classification by Model Translation and Semi-Supervised Learning

In this paper, we introduce a method that automatically builds text classifiers in a new language by training on already labeled data in another language. Our method transfers the classification knowledge across languages by translating the model features and by using an Expectation Maximization (EM) algorithm that naturally takes into account the ambiguity associated with the translation of a word. We further exploit the readily available unlabeled data in the target language via semi-supervised learning, and adapt the translated model to better fit the data distribution of the target language.
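
As a rough illustration of the pipeline the abstract describes, the Python sketch below translates a toy source-language Naive Bayes model through a bilingual dictionary and then adapts it with EM on unlabeled target-language documents. It is not the authors' implementation. The Naive Bayes model, the uniform split of probability mass over candidate translations, the smoothing constant, the function names, and the English/Spanish toy data are all assumptions made for this example, and the paper's EM treatment of translation ambiguity may differ.

```python
import math
from collections import defaultdict

ALPHA = 0.1  # add-alpha smoothing constant (arbitrary choice for this toy example)

def translate_model(src_probs, dictionary):
    """Spread each source word's P(word | class) uniformly over its translations."""
    tgt = {c: defaultdict(float) for c in src_probs}
    for c, probs in src_probs.items():
        for w_s, p in probs.items():
            candidates = dictionary.get(w_s, [])
            for w_t in candidates:
                tgt[c][w_t] += p / len(candidates)
    return tgt

def smooth(weights, vocab):
    """Add-alpha smoothing over the target vocabulary, then renormalize."""
    total = sum(weights.get(w, 0.0) + ALPHA for w in vocab)
    return {w: (weights.get(w, 0.0) + ALPHA) / total for w in vocab}

def posterior(doc, model, priors):
    """P(class | doc) under multinomial Naive Bayes; unknown words are skipped."""
    log_scores = {
        c: math.log(priors[c]) + sum(math.log(model[c][w]) for w in doc if w in model[c])
        for c in model
    }
    top = max(log_scores.values())
    scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

def em_adapt(model, priors, docs, vocab, iterations=5):
    """Semi-supervised adaptation: re-estimate the model from soft-labeled documents."""
    for _ in range(iterations):
        counts = {c: defaultdict(float) for c in model}
        class_mass = {c: 0.0 for c in model}
        for doc in docs:                      # E-step: soft class assignments
            for c, p in posterior(doc, model, priors).items():
                class_mass[c] += p
                for w in doc:
                    counts[c][w] += p
        # M-step: new word distributions and (lightly smoothed) class priors
        model = {c: smooth(counts[c], vocab) for c in model}
        total = sum(class_mass.values())
        priors = {c: (class_mass[c] + 1.0) / (total + len(model)) for c in model}
    return model, priors

# Toy source-language (English) model and a hypothetical English-Spanish dictionary.
src_probs = {
    "sports": {"game": 0.5, "team": 0.5},
    "finance": {"bank": 0.6, "market": 0.4},
}
dictionary = {
    "game": ["juego", "partido"],
    "team": ["equipo"],
    "bank": ["banco", "orilla"],   # ambiguous: financial bank vs. river bank
    "market": ["mercado"],
}
# Unlabeled target-language documents, already tokenized.
unlabeled = [
    ["partido", "equipo", "gol"],
    ["banco", "mercado", "dinero"],
    ["equipo", "gol", "partido"],
    ["mercado", "dinero", "banco"],
]

vocab = sorted({w for ts in dictionary.values() for w in ts} | {w for d in unlabeled for w in d})
translated = translate_model(src_probs, dictionary)
model = {c: smooth(translated[c], vocab) for c in translated}
priors = {c: 1.0 / len(model) for c in model}
model, priors = em_adapt(model, priors, unlabeled, vocab)
```

In this toy setup, words such as "gol" and "dinero" have no dictionary entry, so the translated model knows nothing about them at first. The EM passes over the unlabeled documents are what attach them to a class, which is the kind of adaptation to the target-language distribution that the abstract refers to.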

Lei Shi, Rada Mihalcea, Mingjun Tian

Available Unlabeled Data | EMNLP 2010 | Expectation Maximization | Natural Language Processing | Target Language

» Cross-Language Frame Semantics Transfer in Bilingual Corpora

» Mining Bilingual Data from the Web with Adaptively Learnt Patterns

» Applying a Dynamic Bayesian Network Framework to Transliteration Identification

» Can Chinese web pages be classified with English data source?

Post Info

Added	11 Feb 2011
Updated	11 Feb 2011
Type	Conference
Year	2010
Where	EMNLP
Authors	Lei Shi, Rada Mihalcea, Mingjun Tian

Sciweavers
