KDD
2008
ACM

Data mining using high performance data clouds: experimental studies using sector and sphere

We describe the design and implementation of a high performance cloud that we have used to archive, analyze and mine large distributed data sets. By a cloud, we mean an infrastructure that provides resources and/or services over the Internet. A storage cloud provides storage services, while a compute cloud provides compute services. We describe the design of the Sector storage cloud and how it provides the storage services required by the Sphere compute cloud. We also describe the programming paradigm supported by the Sphere compute cloud. Sector and Sphere are designed for analyzing large data sets using computer clusters connected with wide area high performance networks (for example, 10+ Gb/s). We describe a distributed data mining application that we have developed using Sector and Sphere. Finally, we describe some experimental studies comparing Sector/Sphere to Hadoop.

Categories and Subject Descriptors: H.2.8 [Database Management]: Data mining, C.2.4 [Computer - Communications N...
Robert L. Grossman, Yunhong Gu
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2008
Where KDD
Authors Robert L. Grossman, Yunhong Gu