Unsupervised Learning with Permuted Data

We consider the problem of unsupervised learning from a matrix of data vectors where in each row the observed values are randomly permuted in an unknown fashion. Such problems arise naturally in areas such as computer vision and text modeling where measurements need not be in correspondence with the correct features. We provide a general theoretical characterization of the difficulty of "unscrambling" the values of the rows for such problems and relate the optimal error rate to the well-known concept of the Bayes classification error rate. For known parametric distributions we derive closed-form expressions for the optimal error rate that provide insight into what makes this problem difficult in practice. Finally, we show how the Expectation-Maximization procedure can be used to simultaneously estimate both a probabilistic model for the features and a distribution over the correspondence of the row values.
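To make the final sentence of the abstract concrete, the following is a minimal sketch of one way EM could be set up for permuted rows. It is not taken from the paper: it assumes independent Gaussian features, treats the permutation of each row as a latent variable with a learned prior, and enumerates all c! permutations, which is only practical for a small number of columns. The function name em_permuted_gaussians and all variable names are hypothetical.

```python
import itertools
import numpy as np

def em_permuted_gaussians(X, n_iter=50):
    """EM sketch (assumed model): each row holds c independent Gaussian
    features whose positions were shuffled by an unknown permutation."""
    n, c = X.shape
    # perms[p, k] = index of the feature observed at position k under permutation p
    perms = np.array(list(itertools.permutations(range(c))))
    P = len(perms)
    # onehot[p, k, j] = 1 when permutation p places feature j at position k
    onehot = (perms[:, :, None] == np.arange(c)).astype(float)
    # Start from sorted row values so the feature means begin distinct
    mu = np.sort(X, axis=1).mean(axis=0)
    sigma = np.full(c, X.std())
    prior = np.full(P, 1.0 / P)
    for _ in range(n_iter):
        # E-step: posterior over permutations for every row (log domain)
        m, s = mu[perms], sigma[perms]                      # each (P, c)
        z = (X[:, None, :] - m[None]) / s[None]             # (n, P, c)
        log_lik = (-0.5 * z**2 - np.log(s)[None]).sum(axis=2)
        log_post = np.log(prior + 1e-300)[None] + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)
        r = np.exp(log_post)
        r /= r.sum(axis=1, keepdims=True)                   # (n, P)
        # M-step: update the permutation prior and the Gaussian parameters
        prior = r.mean(axis=0)
        # w[i, k, j] = posterior probability that X[i, k] came from feature j
        w = np.einsum('ip,pkj->ikj', r, onehot)
        tot = w.sum(axis=(0, 1))
        mu = np.einsum('ikj,ik->j', w, X) / tot
        sq = (X[:, :, None] - mu[None, None, :]) ** 2
        sigma = np.sqrt(np.maximum(np.einsum('ikj,ikj->j', w, sq) / tot, 1e-6))
    return mu, sigma, prior, perms, r
```

In this sketch, r[i] is the estimated distribution over correspondences for row i, and perms[r[i].argmax()] gives the most probable assignment of positions to features. When the true permutations are uniformly random, the feature labels themselves are arbitrary, so the means are recovered only up to a relabeling.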
Sergey Kirshner, Sridevi Parise, Padhraic Smyth
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2003
Where ICML