We propose a novel approach, called Dynamic Fractional Resource Scheduling (DFRS), to share homogeneous cluster computing platforms among competing jobs. DFRS leverages virtual mac...
The goal of clustering is to identify distinct groups in a dataset. The basic idea of model-based clustering is to approximate the data density by a mixture model, typically a mix...
We revisit recently proposed algorithms for probabilistic clustering with pair-wise constraints between data points. We evaluate and compare existing techniques in terms of robust...
The Dirichlet compound multinomial (DCM) distribution, also called the multivariate Polya distribution, is a model for text documents that takes into account burstiness: the fact ...
In this paper we describe a new cluster model which is based on the concept of linear manifolds. The method identifies subsets of the data which are embedded in arbitrary oriented...