Sciweavers

EMNLP
2009

Domain adaptive bootstrapping for named entity recognition

Bootstrapping is the process of improving the performance of a trained classifier by iteratively adding data labeled by the classifier itself to the training set and retraining the classifier. It is often used in situations where labeled training data is scarce but unlabeled data is abundant. In this paper, we consider the problem of domain adaptation: the situation where training data may not be scarce, but belongs to a different domain from the target application domain. As the distribution of the unlabeled data differs from that of the training data, standard bootstrapping often has difficulty selecting informative data to add to the training set. We propose an effective domain adaptive bootstrapping algorithm that selects unlabeled target domain data that is informative about the target domain and easy to label correctly by automatic means. We call these instances bridges, as they are used to bridge the source domain to the target domain. We show that the method outperforms supervis...
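The standard bootstrapping loop described in the abstract (train, self-label the unlabeled pool, keep the confident predictions, retrain) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's domain adaptive algorithm: the toy nearest-centroid classifier, the margin-based confidence test, and the names `fit_centroids`, `predict_with_margin`, `bootstrap`, and `min_margin` are all hypothetical and chosen only to keep the example self-contained. It does not implement the bridge selection the authors propose.

```python
from math import dist


def fit_centroids(points, labels):
    """Train a toy nearest-centroid classifier: one mean vector per label."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(v / counts[y] for v in acc) for y, acc in sums.items()}


def predict_with_margin(centroids, p):
    """Return (label, margin); margin is the distance gap between the two
    closest centroids and serves as a crude confidence score (an assumption)."""
    ranked = sorted((dist(p, c), y) for y, c in centroids.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def bootstrap(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Generic self-training: repeatedly add confidently self-labeled points
    to the training set and retrain. Returns (final model, unused pool)."""
    points = [p for p, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(points, labels)
        confident, rest = [], []
        for p in pool:
            y, margin = predict_with_margin(centroids, p)
            (confident if margin >= min_margin else rest).append((p, y))
        if not confident:
            break  # nothing new to add, so further retraining changes nothing
        for p, y in confident:
            points.append(p)
            labels.append(y)
        pool = [p for p, _ in rest]
    return fit_centroids(points, labels), pool


# Two labeled seed points and three unlabeled points (made-up toy data).
seed = [((0.0, 0.0), "O"), ((4.0, 4.0), "PER")]
model, leftover = bootstrap(seed, [(0.5, 0.0), (4.0, 3.5), (2.0, 2.0)])
```

In this toy run the two points near the seeds are absorbed in the first round, while the point halfway between the classes never clears the margin and stays in the pool. A fixed confidence test of this kind favors instances that resemble the existing training data, which is one way to read the abstract's remark that standard bootstrapping has difficulty selecting informative data when the target domain differs from the source.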
Dan Wu, Wee Sun Lee, Nan Ye, Hai Leong Chieu
Added 17 Feb 2011
Updated 17 Feb 2011
Type Journal
Year 2009
Where EMNLP