Sciweavers

KDD
2004
ACM

A generalized maximum entropy approach to bregman co-clustering and matrix approximation

14 years 4 months ago
A generalized maximum entropy approach to bregman co-clustering and matrix approximation
Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an informationtheoretic co-clustering approach applicable to empirical joint probability distributions was proposed. In many situations, co-clustering of more general matrices is desired. In this paper, we present a substantially generalized co-clustering framework wherein any Bregman divergence can be used in the objective function, and various conditional expectation based constraints can be considered based on the statistics that need to be preserved. Analysis of the coclustering problem leads to the minimum Bregman information principle, which generalizes the maximum entropy principle, and yields an elegant meta algorithm that is guaranteed to achieve local optimality. Our methodology yields new algorithms and also encompasses several previously known clustering and co-clustering algorithms based on alternate minimization. Categ...
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Gho
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2004
Where KDD
Authors Arindam Banerjee, Inderjit S. Dhillon, Joydeep Ghosh, Srujana Merugu, Dharmendra S. Modha
Comments (0)