Deduplication is a key operation in integrating data from multiple sources. The main challenge in this task is designing a function that can resolve when a pair of records refer t...
Ranking is at the heart of many information retrieval applications. Unlike standard regression or classification, in which we predict outputs independently, in ranking we are inter...
Empirical risk minimization offers well-known learning guarantees when training and test data come from the same domain. In the real world, though, we often wish to adapt a classi...
John Blitzer, Koby Crammer, Alex Kulesza, Fernando...
A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as...
Gustavo Carneiro, Antoni B. Chan, Pedro J. Moreno,...
Background: Independently derived expression profiles of the same biological condition often have few genes in common. In this study, we created populations of expression profiles...
Michael Gormley, William Dampier, Adam Ertel, Bilg...