This paper presents a general framework for adapting any generative (model-based) clustering algorithm to provide balanced solutions, i.e., clusters of comparable sizes. Partition...
This paper presents a simple but powerful extension of the maximum margin clustering (MMC) algorithm that optimizes a multivariate performance measure specifically defined for clust...
Clustering is a basic task in a variety of machine learning applications. Partitioning a set of input vectors into compact, well-separated subsets can be severely affected by the p...
Pedro A. Forero, Vassilis Kekatos, Georgios B. Gia...
We present a transparent, system-level checkpointing solution for master-worker parallelism that automatically adapts, upon restart, to the number of processor nodes available. Th...
The deluge of huge data sets such as those provided by sensor networks, online transactions, and the web provides exciting opportunities for data analysis. The scale of the data ...