Based on simple methods such as observing word and part of speech tag co-occurrence and clustering, we generate syntactic parses of sentences in an entirely unsupervised and self-...
An adaptive semi-supervised ensemble method, ASSEMBLE, is proposed that constructs classification ensembles based on both labeled and unlabeled data. ASSEMBLE alternates between a...
We present two new methods for obtaining generalization error bounds in a semi-supervised setting. Both methods are based on approximating the disagreement probability of pairs of ...
Mining bilingual data (including bilingual sentences and terms1 ) from the Web can benefit many NLP applications, such as machine translation and cross language information retrie...
Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu, ...
Robust statistical learning based web spam detection system often requires large amounts of labeled training data. However, labeled samples are more difficult, expensive and time ...