Sciweavers

» Distributed Data Clustering
IDA
2009
Springer
14 years 9 months ago
Context-Based Distance Learning for Categorical Data Clustering
Abstract. Clustering data described by categorical attributes is a challenging task in data mining applications. Unlike numerical attributes, it is difficult to define a distance b...
Dino Ienco, Ruggero G. Pensa, Rosa Meo
OSDI
2004
ACM
16 years 3 days ago
MapReduce: Simplified Data Processing on Large Clusters
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to ge...
Jeffrey Dean, Sanjay Ghemawat
JMLR
2010
14 years 6 months ago
Hierarchical Convex NMF for Clustering Massive Data
We present an extension of convex-hull non-negative matrix factorization (CH-NMF) which was recently proposed as a large scale variant of convex non-negative matrix factorization ...
Kristian Kersting, Mirwaes Wahabzada, Christian Th...
ICDE
2006
IEEE
15 years 5 months ago
Privacy Preserving Clustering on Horizontally Partitioned Data
Data mining has been a popular research area for more than a decade due to its vast spectrum of applications. The power of data mining tools to extract hidden information that can...
Ali Inan, Yücel Saygin, Erkay Savas, Ayç...
IPPS
2007
IEEE
15 years 6 months ago
Towards A Better Understanding of Workload Dynamics on Data-Intensive Clusters and Grids
This paper presents a comprehensive statistical analysis of workloads collected on data-intensive clusters and Grids. The analysis is conducted at different levels, including Virt...
Hui Li, Lex Wolters