The min-sum k-clustering problem is to partition a metric space (P, d) into k clusters $C_1, \ldots, C_k \subseteq P$ such that $\sum_{i=1}^{k} \sum_{p,q \in C_i} d(p, q)$ is minimized. We show the first effi...
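To make the objective concrete, the following minimal sketch evaluates the min-sum cost of a given partition. The names `min_sum_cost` and `euclidean` and the sample points are hypothetical and chosen only for illustration. The sketch also assumes each unordered pair {p, q} is counted once; if the sum is read over ordered pairs, the value simply doubles. It only scores a partition and does not implement the algorithm the abstract refers to.

```python
from itertools import combinations
import math


def min_sum_cost(clusters, dist):
    """Sum of intra-cluster pairwise distances over all clusters.

    Assumption: each unordered pair {p, q} within a cluster is counted once.
    """
    return sum(
        dist(p, q)
        for cluster in clusters
        for p, q in combinations(cluster, 2)
    )


def euclidean(p, q):
    # Euclidean distance is one example of a metric d; any metric would do.
    return math.dist(p, q)


# Hypothetical toy partition into k = 2 clusters.
cluster_a = [(0.0, 0.0), (3.0, 4.0)]
cluster_b = [(10.0, 10.0)]
cost = min_sum_cost([cluster_a, cluster_b], euclidean)
```

A singleton cluster contributes nothing to the sum, so the cost here comes entirely from the one pair in `cluster_a`.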
Background: Hierarchical clustering is a widely applied tool in the analysis of microarray gene expression data. The assessment of cluster stability is a major challenge in cluste...
A cross-validation error estimator is obtained by repeatedly leaving out some data points, deriving classifiers on the remaining points, computing errors for these classifiers on ...
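The procedure described above can be sketched as follows. Because the abstract is truncated, this sketch assumes the usual convention that errors are measured on the left-out points and averaged over all of them. The function names, the interleaved fold assignment, the 1-nearest-neighbour classifier, and the toy data are all hypothetical choices made for illustration, not the estimator studied in the paper.

```python
def one_nn_classifier(train):
    """Derive a 1-nearest-neighbour classifier from (x, label) pairs."""
    def predict(x):
        nearest = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest[1]
    return predict


def cross_validation_error(data, n_folds, fit):
    """Fraction of points misclassified when each fold is left out in turn."""
    errors = 0
    for fold in range(n_folds):
        # Leave out every n_folds-th point, starting at index `fold`.
        held_out = data[fold::n_folds]
        remaining = [pt for i, pt in enumerate(data) if i % n_folds != fold]
        # Derive a classifier on the remaining points only.
        predict = fit(remaining)
        # Count mistakes on the left-out points.
        errors += sum(1 for x, y in held_out if predict(x) != y)
    return errors / len(data)


# Hypothetical one-dimensional data with two well-separated classes.
data = [(0.0, "a"), (0.1, "a"), (0.2, "a"), (1.0, "b"), (1.1, "b"), (1.2, "b")]
cv_error = cross_validation_error(data, n_folds=3, fit=one_nn_classifier)
```

Setting `n_folds` to `len(data)` gives the leave-one-out variant of the same estimator.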
Association rule mining is a frequently used technique that finds interesting association and correlation relationships among large sets of data items which occur frequently toge...
Most existing semi-supervised learning methods are based on the smoothness assumption that data points in the same high density region should have the same label. This assumption, ...