Sciweavers

179 search results - page 2 / 36
» Parallel k/h-Means Clustering for Large Data Sets
ICPP
2000
IEEE
13 years 9 months ago
A Scalable Parallel Subspace Clustering Algorithm for Massive Data Sets
Clustering is a data mining problem which finds dense regions in a sparse multi-dimensional data set. The attribute values and ranges of these regions characterize the clusters. ...
Harsha S. Nagesh, Sanjay Goil, Alok N. Choudhary
BMCBI
2010
13 years 2 months ago
A grammar-based distance metric enables fast and accurate clustering of large sets of 16S sequences
Background: We propose a sequence clustering algorithm and compare the partition quality and execution time of the proposed algorithm with those of a popular existing algorithm. T...
David J. Russell, Samuel F. Way, Andrew K. Benson,...
PR
2008
13 years 5 months ago
Modified global k
Clustering in gene expression data sets is a challenging problem. Different algorithms for clustering of genes have been proposed. However, due to the large number of genes only a ...
Adil M. Bagirov
APPROX
2008
Springer
13 years 7 months ago
Streaming Algorithms for k-Center Clustering with Outliers and with Anonymity
Clustering is a common problem in the analysis of large data sets. Streaming algorithms, which make a single pass over the data set using small working memory and produce a cluster...
Richard Matthew McCutchen, Samir Khuller
PVLDB
2008
13 years 4 months ago
SCOPE: easy and efficient parallel processing of massive data sets
Companies providing cloud-scale services have an increasing need to store and analyze massive data sets such as search logs and click streams. For cost and performance reasons, pr...
Ronnie Chaiken, Bob Jenkins, Per-Åke Larson,...