Abstract. In many supervised learning tasks it can be costly or infeasible to obtain objective, reliable labels. We may, however, be able to obtain a large number of subjective, po...
We propose a non-linear Canonical Correlation Analysis (CCA) method which works by coordinating or aligning mixtures of linear models. In the same way that CCA extends the idea of...
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...
Multiple Species Weighted Voting (MSWV) is a genetics-based machine learning (GBML) system with relatively few parameters that combines N two-class classifiers into an N-class cla...
We demonstrate the effectiveness of multilingual learning for unsupervised part-of-speech tagging. The key hypothesis of multilingual learning is that by combining cues from multi...
Benjamin Snyder, Tahira Naseem, Jacob Eisenstein, ...