The composition of the example set has a major impact on the quality of neural learning. The popular approach is focused on extensive preprocessing to bridge the representation ga...
Rare category detection is an open challenge for active learning, especially in the de-novo case (no labeled examples), but of significant practical importance for data mining - ...
Solving stochastic optimization problems under partial observability, where one needs to adaptively make decisions with uncertain outcomes, is a fundamental but notoriously diffic...
Deduplication, a key operation in integrating data from multiple sources, is a time-consuming, labor-intensive and domainspecific operation. We present our design of alias that us...
For many types of machine learning algorithms, one can compute the statistically optimal" way to select training data. In this paper, we review how optimal data selection tec...
David A. Cohn, Zoubin Ghahramani, Michael I. Jorda...