We propose the hierarchical Dirichlet process (HDP), a nonparametric Bayesian model for clustering problems involving multiple groups of data. Each group of data is modeled with a...
Yee Whye Teh, Michael I. Jordan, Matthew J. Beal, ...
Background: Sequencing of environmental DNA (often called metagenomics) has shown tremendous potential to uncover the vast number of unknown microbes that cannot be cultured and s...
We describe a generic programming model to design collective communications on SMP clusters. The programming model utilizes shared memory for collective communications and overlap...
We examine the learning-curve sampling method, an approach for applying machine-learning algorithms to large data sets. The approach is based on the observation that the computatio...
The performance of K-means and Gaussian mixture model (GMM) clustering depends on the initial guess of partitions. Typically, clus...