Sciweavers

JPDC
2008

Middleware for data mining applications on clusters and grids

This paper gives an overview of two middleware systems that have been developed over the last 6 years to address the challenges involved in developing parallel and distributed implementations of data mining algorithms. FREERIDE (FRamework for Rapid Implementation of Data mining Engines) focuses on data mining in a cluster environment. FREERIDE is based on the observation that parallel versions of several well-known data mining techniques share a relatively similar structure, and can be parallelized by dividing the data instances (or records or transactions) among the nodes. The computation on each node involves reading the data instances in an arbitrary order, processing each data instance, and performing a local reduction. The reduction involves only commutative and associative operations, which means the result is independent of the order in which the data instances are processed. After the local reduction on each node, a global reduction is performed. This similarity in the structu...
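The processing structure described in the abstract (divide the data instances among nodes, perform a local reduction on each node, then a global reduction) can be illustrated with a small sketch. This is not FREERIDE's API, which the abstract does not show; the function names `partition`, `local_reduction`, `global_reduction`, and `run`, the item-counting task, and the sample transactions are all hypothetical, and the "nodes" are simulated as list slices in a single process.

```python
from collections import Counter
from functools import reduce
import random


def partition(records, num_nodes):
    """Divide the data instances round-robin among hypothetical nodes."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def local_reduction(records):
    """Process each instance and fold it into a node-local reduction object.

    The update is integer addition, which is commutative and associative,
    so the order in which the instances are read does not matter.
    """
    counts = Counter()
    for transaction in records:
        for item in set(transaction):
            counts[item] += 1
    return counts


def global_reduction(local_results):
    """Merge the per-node reduction objects into one result."""
    return reduce(lambda a, b: a + b, local_results, Counter())


def run(records, num_nodes):
    """Local reduction on every partition, followed by a global reduction."""
    return global_reduction(local_reduction(p) for p in partition(records, num_nodes))


# Made-up sample data, for illustration only.
transactions = [
    ["milk", "bread"],
    ["bread", "eggs"],
    ["milk", "eggs", "bread"],
    ["eggs"],
    ["milk"],
]

result = run(transactions, 3)

# Reordering the instances and changing the node count leaves the result unchanged.
shuffled = transactions[:]
random.Random(0).shuffle(shuffled)
assert run(shuffled, 2) == result
```

Because the reduction object is only ever updated with a commutative and associative operation, the same code works for any assignment of instances to nodes, which is the property the abstract relies on.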
Leonid Glimcher, Ruoming Jin, Gagan Agrawal
Added 13 Dec 2010
Updated 13 Dec 2010
Type Journal
Year 2008
Where JPDC