The input to an algorithm that learns a binary classifier normally consists of two sets of examples, where one set consists of positive examples of the concept to be learned, and ...
This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated la...
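The snippet above concerns acquiring multiple noisy labels per data item. A minimal sketch of the simplest aggregation strategy studied in this line of work, majority voting over repeated labels (the function name and example data here are illustrative assumptions, not the paper's code), might look like:

```python
from collections import Counter

def majority_vote(labels):
    """Aggregate repeated noisy labels for one item by majority vote.
    Ties are broken by whichever label Counter encounters first."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical example: three annotators label each of three items.
repeated = {
    "item1": [1, 1, 0],
    "item2": [0, 0, 0],
    "item3": [1, 0, 1],
}
aggregated = {item: majority_vote(lbls) for item, lbls in repeated.items()}
# aggregated == {"item1": 1, "item2": 0, "item3": 1}
```

Whether acquiring a third label for an already-labeled item beats labeling a new item is exactly the trade-off the abstract refers to.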
Victor S. Sheng, Foster J. Provost, Panagiotis G. ...
Current hidden Markov acoustic modeling for large vocabulary continuous speech recognition (LVCSR) relies on the availability of abundant labeled transcriptions. Given that speech...
We introduce a Bayesian model, BayesANIL, that is capable of estimating uncertainties associated with the labeling process. Given a labeled or partially labeled training corpus of...
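The abstract describes estimating uncertainty in the labeling process. As a hedged illustration only (a generic Beta-Bernoulli posterior, not the BayesANIL model itself), per-item label uncertainty from repeated binary annotations can be estimated as:

```python
def label_posterior(labels, alpha=1.0, beta=1.0):
    """Posterior mean of P(true label = 1) under a Beta(alpha, beta)
    prior, treating each repeated binary label as a Bernoulli draw.
    Illustrative stand-in; the actual BayesANIL model differs."""
    pos = sum(labels)
    return (alpha + pos) / (alpha + beta + len(labels))

# Three annotators, two vote positive: posterior mean (1+2)/(2+3) = 0.6
p = label_posterior([1, 1, 0])
```

A posterior near 0.5 flags an item whose label is unreliable, which is the kind of quantity a learner can down-weight during training.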
In this paper, we present an overview of generalized expectation criteria (GE), a simple, robust, scalable method for semi-supervised training using weakly-labeled data. GE fits m...
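A GE criterion scores a model by how closely its expected statistics on unlabeled data match target expectations supplied as weak supervision. The sketch below shows one common instance, a KL penalty between a target label distribution and the model's average predicted distribution; the setup and names are assumptions for illustration, not the paper's implementation:

```python
import math

def ge_penalty(model_probs, target_dist):
    """KL(target || model average) over predicted label distributions.
    model_probs: list of per-example predicted distributions.
    target_dist: weakly-specified target label distribution.
    Minimizing this pushes the model's aggregate predictions toward
    the target expectations (a GE-style criterion, sketched)."""
    n = len(model_probs)
    avg = [sum(p[k] for p in model_probs) / n for k in range(len(target_dist))]
    return sum(t * math.log(t / a) for t, a in zip(target_dist, avg) if t > 0)

# If the model's average prediction already matches the target,
# the penalty is zero.
zero = ge_penalty([[0.7, 0.3], [0.3, 0.7]], [0.5, 0.5])
```

In practice this penalty is added to the supervised objective and its gradient is taken with respect to the model parameters.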