Sciweavers

42 search results - page 1 / 9
PPOPP
2005
ACM
13 years 9 months ago
A sampling-based framework for parallel data mining
The goal of a data mining algorithm is to discover useful information embedded in large databases. Frequent itemset mining and sequential pattern mining are two important data minin...
Shengnan Cong, Jiawei Han, Jay Hoeflinger, David A...
IPPS
2003
IEEE
13 years 9 months ago
A Compilation Framework for Distributed Memory Parallelization of Data Mining Algorithms
With the availability of large datasets in a variety of scientific and commercial domains, data mining has emerged as an important area within the last decade. Data mining techni...
Xiaogang Li, Ruoming Jin, Gagan Agrawal
SC
2005
ACM
13 years 9 months ago
PerfExplorer: A Performance Data Mining Framework For Large-Scale Parallel Computing
Parallel applications running on high-end computer systems manifest a complexity of performance phenomena. Tools to observe parallel performance attempt to capture these phenomena...
Kevin A. Huck, Allen D. Malony
CCGRID
2009
IEEE
13 years 11 months ago
Performance Issues in Parallelizing Data-Intensive Applications on a Multi-core Cluster
The deluge of available data for analysis demands the need to scale the performance of data mining implementations. With the current architectural trends, one of the major challen...
Vignesh T. Ravi, Gagan Agrawal
KDD
2009
ACM
14 years 4 months ago
Pervasive parallelism in data mining: dataflow solution to co-clustering large and sparse Netflix data
All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...