
ICML
2003
IEEE

Learning on the Test Data: Leveraging Unseen Features

This paper addresses the problem of classification in situations where the data distribution is not homogeneous: data instances might come from different locations or times, and are therefore sampled from related but different distributions. In particular, features may appear in some parts of the data that are rarely or never seen in others. In most situations with nonhomogeneous data, the training data is not representative of the distribution under which the classifier must operate. We propose a method, based on probabilistic graphical models, for utilizing unseen features during classification. For each such unseen feature, our method introduces a continuous hidden variable describing its influence on the class, that is, whether it tends to be associated with some label. We then use probabilistic inference over the test data to infer a distribution over the value of this hidden variable. Intuitively, we "learn" the role of this unseen feature from the test set, generalizing fro...
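To make the idea of inferring an unseen feature's role from the test set concrete, here is a minimal sketch in Python. It is not the authors' graphical model: it swaps in a much simpler stand-in, a naive-Bayes-style EM loop that keeps a point estimate of the hidden parameters rather than a distribution over them. The function name `infer_unseen_feature`, the binary-label setting, and the inputs `base_probs` (class-1 probabilities from a classifier trained only on the seen features) and `has_feature` (whether the unseen feature occurs in each test instance) are all assumptions made for illustration.

```python
def infer_unseen_feature(base_probs, has_feature, n_iters=20, smoothing=1.0):
    """Hypothetical illustration, not the paper's algorithm.

    base_probs[i]  : P(y_i = 1) from a classifier trained on seen features only.
    has_feature[i] : 1 if the unseen feature occurs in test instance i, else 0.
    Returns (posteriors, theta1, theta0), where theta1 ~ P(feature | y = 1)
    and theta0 ~ P(feature | y = 0) are the hidden quantities being estimated.
    """
    # Start with no opinion about how the feature relates to the label.
    theta1, theta0 = 0.5, 0.5
    posteriors = list(base_probs)
    for _ in range(n_iters):
        # E-step: combine the base classifier's belief with the current
        # estimate of the unseen feature's class-conditional probabilities.
        posteriors = []
        for p, x in zip(base_probs, has_feature):
            like1 = theta1 if x else 1.0 - theta1
            like0 = theta0 if x else 1.0 - theta0
            num = p * like1
            posteriors.append(num / (num + (1.0 - p) * like0))
        # M-step: re-estimate the hidden parameters from soft labels on the
        # test data, with additive smoothing to keep them strictly in (0, 1).
        w1 = sum(posteriors)
        w0 = len(posteriors) - w1
        f1 = sum(q for q, x in zip(posteriors, has_feature) if x)
        f0 = sum(1.0 - q for q, x in zip(posteriors, has_feature) if x)
        theta1 = (f1 + smoothing) / (w1 + 2.0 * smoothing)
        theta0 = (f0 + smoothing) / (w0 + 2.0 * smoothing)
    return posteriors, theta1, theta0
```

In this sketch, a test instance that the base classifier cannot decide (probability 0.5) is pulled toward class 1 if it contains a feature that mostly co-occurs with instances the base classifier is confident are class 1. The paper's approach, as described in the abstract, instead infers a distribution over a continuous hidden variable within a probabilistic graphical model.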
Benjamin Taskar, Ming Fai Wong, Daphne Koller
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2003
Where ICML
Authors Benjamin Taskar, Ming Fai Wong, Daphne Koller