A PAC-Bayesian Analysis of Graph Clustering and Pairwise Clustering

We formulate weighted graph clustering as a prediction problem: given a subset of edge weights, we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering, as well as comparison of graph clustering with other possible ways to model the graph. We adapt the PAC-Bayesian analysis of co-clustering (Seldin and Tishby, 2008; Seldin, 2009) to derive a PAC-Bayesian generalization bound for graph clustering. The bound shows that graph clustering should optimize a trade-off between empirical data fit and the mutual information that clusters preserve on the graph nodes. A similar trade-off derived from information-theoretic considerations was already shown to produce state-of-the-art results in practice (Slonim et al., 2005; Yom-Tov and Slonim, 2009). This paper supports the empirical evidence by providing a better theoretical foundation, suggesting formal generalization...
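The prediction view and the trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's bound or algorithm: the squared loss, the trade-off parameter `beta`, the uniform distribution over nodes, and all function names are assumptions made for illustration. Edge weights between cluster pairs are estimated from the observed edges, held-out edges are predicted from those estimates, and the mutual information I(X;C) between nodes and clusters measures how much the clustering preserves about node identity.

```python
import math

def mutual_information(q):
    """I(X;C) in nats for soft assignments q[x][c] = q(c|x), assuming uniform p(x)."""
    n, k = len(q), len(q[0])
    # Cluster marginal q(c) = (1/n) * sum_x q(c|x)
    marginal = [sum(q[x][c] for x in range(n)) / n for c in range(k)]
    mi = 0.0
    for x in range(n):
        for c in range(k):
            if q[x][c] > 0:
                mi += q[x][c] / n * math.log(q[x][c] / marginal[c])
    return mi

def cluster_weights(q, observed):
    """Assignment-weighted mean observed edge weight for each pair of clusters."""
    k = len(q[0])
    num = [[0.0] * k for _ in range(k)]
    den = [[0.0] * k for _ in range(k)]
    for (i, j), w in observed.items():
        # The graph is undirected, so count both orientations of each edge.
        for u, v in ((i, j), (j, i)):
            for a in range(k):
                for b in range(k):
                    p = q[u][a] * q[v][b]
                    num[a][b] += p * w
                    den[a][b] += p
    return [[num[a][b] / den[a][b] if den[a][b] > 0 else 0.0
             for b in range(k)] for a in range(k)]

def predict(q, W, i, j):
    """Expected weight of edge (i, j) under the cluster-level weights W."""
    k = len(q[0])
    return sum(q[i][a] * q[j][b] * W[a][b] for a in range(k) for b in range(k))

def squared_loss(q, W, edges):
    """Mean squared prediction error over a dict {(i, j): weight}."""
    return sum((predict(q, W, i, j) - w) ** 2
               for (i, j), w in edges.items()) / len(edges)

def objective(q, observed, beta):
    """Illustrative trade-off: empirical fit plus I(X;C) scaled by 1/beta (assumed form)."""
    W = cluster_weights(q, observed)
    return squared_loss(q, W, observed) + mutual_information(q) / beta

# Toy graph on 4 nodes: strong edges inside {0, 1} and {2, 3}, none across.
observed = {(0, 1): 1.0, (2, 3): 1.0, (0, 2): 0.0, (1, 3): 0.0}
held_out = {(0, 3): 0.0, (1, 2): 0.0}

two_clusters = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
one_cluster = [[1.0], [1.0], [1.0], [1.0]]

for name, q in (("two clusters", two_clusters), ("one cluster", one_cluster)):
    W = cluster_weights(q, observed)
    print(name,
          "held-out loss:", squared_loss(q, W, held_out),
          "I(X;C):", mutual_information(q),
          "objective:", objective(q, observed, beta=10.0))
```

On this toy graph the two-cluster assignment predicts the held-out weights exactly but preserves ln 2 nats of information about the nodes, while the single-cluster assignment preserves no information and has a held-out squared error of 0.25. The assumed `beta` controls how heavily the information term is penalized relative to the fit.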
Yevgeny Seldin
Added 09 Dec 2010
Updated 09 Dec 2010
Type Journal
Year 2010
Where CoRR
Authors Yevgeny Seldin