Analyzing the Effectiveness and Applicability of Co-training

Recently there has been significant interest in supervised learning algorithms that combine labeled and unlabeled data for text learning tasks. The co-training setting [1] applies to datasets that have a natural separation of their features into two disjoint sets. We demonstrate that when learning from labeled and unlabeled data, algorithms that explicitly leverage a natural independent split of the features outperform algorithms that do not. When a natural split does not exist, co-training algorithms that manufacture a feature split may outperform algorithms that do not use a split. These results help explain why co-training algorithms are both discriminative in nature and robust to the assumptions of their embedded classifiers.
Categories and Subject Descriptors I.2.6 [Artificial Intelligence]: Learning; H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - Information Filtering
Keywords co-training, expectation-maximization, learning with labeled and unlabeled d...
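The co-training setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier stands in for whatever embedded classifier a real system would use, the synthetic two-view data is invented for the example, and every name here (`CentroidClassifier`, `co_train`, `make_example`, `predict`) is hypothetical. Each example carries two feature views, one classifier is trained per view, and in each round each classifier labels the unlabeled examples it is most confident about and adds them to the shared labeled set.

```python
import random

def centroid(points):
    """Mean vector of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class CentroidClassifier:
    """Toy binary classifier (an assumption, not from the paper):
    predicts the class whose training centroid is nearer."""

    def fit(self, X, y):
        self.centers = {c: centroid([x for x, t in zip(X, y) if t == c])
                        for c in (0, 1)}
        return self

    def score(self, x):
        """Return (predicted label, non-negative confidence margin)."""
        d0 = sq_dist(x, self.centers[0])
        d1 = sq_dist(x, self.centers[1])
        return (0 if d0 <= d1 else 1), abs(d0 - d1)

def fit_view(labeled, view):
    """Train a classifier on one feature view of the labeled set."""
    return CentroidClassifier().fit([x[view] for x, _ in labeled],
                                    [t for _, t in labeled])

def co_train(labeled, unlabeled, rounds=10, per_round=2):
    """labeled: list of ((view_a, view_b), label); unlabeled: list of (view_a, view_b).
    Each round, each view's classifier pseudo-labels its most confident
    unlabeled examples, which then become training data for both views."""
    labeled, pool = list(labeled), list(unlabeled)  # copies; inputs are not mutated
    for _ in range(rounds):
        for view in (0, 1):
            clf = fit_view(labeled, view)
            pool.sort(key=lambda x: clf.score(x[view])[1], reverse=True)
            confident, pool = pool[:per_round], pool[per_round:]
            labeled += [(x, clf.score(x[view])[0]) for x in confident]
    return fit_view(labeled, 0), fit_view(labeled, 1)

def predict(classifiers, example):
    """Combine the two views by trusting the more confident classifier."""
    votes = [clf.score(example[v]) for v, clf in enumerate(classifiers)]
    return max(votes, key=lambda vote: vote[1])[0]

def make_example(label, rng):
    """Synthetic example with two 2-D views drawn independently given the label."""
    center = 5.0 * label
    return tuple([rng.gauss(center, 1.0) for _ in range(2)] for _ in range(2))

if __name__ == "__main__":
    rng = random.Random(0)
    labeled = [(make_example(c, rng), c) for c in (0, 1) for _ in range(2)]
    unlabeled = [make_example(rng.randint(0, 1), rng) for _ in range(100)]
    test = [(make_example(c, rng), c) for c in (0, 1) for _ in range(50)]
    classifiers = co_train(labeled, unlabeled)
    accuracy = sum(predict(classifiers, x) == c for x, c in test) / len(test)
    print(f"held-out accuracy: {accuracy:.2f}")
```

The synthetic views are drawn independently given the class, which mirrors the natural independent feature split the abstract refers to. With real text data the two views would instead be two disjoint feature sets of the same document.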
Kamal Nigam, Rayid Ghani
Added 02 Aug 2010
Updated 02 Aug 2010
Type Conference
Year 2000
Where CIKM