ICDE
2010
IEEE

Finding Clusters in Subspaces of Very Large, Multi-dimensional Datasets

Abstract: We propose Multi-resolution Correlation Cluster detection (MrCC), a novel, scalable method for detecting correlation clusters in multi-dimensional data with roughly 5 to 30 axes. Existing methods typically exhibit superlinear behavior in space or execution time. MrCC employs a novel multi-resolution data structure and improves on previous approaches in four ways: (a) it finds clusters that stand out in the data in a statistical sense; (b) its running time and memory usage are linear in the number of data points and in the dimensionality of the subspaces where clusters exist; (c) its memory usage is linear, and its running time quasi-linear, in the dimensionality of the space; and (d) it is accurate, deterministic, and robust to noise, does not require the number of clusters as an input parameter, performs no distance calculations, and detects clusters in subspaces generated by the original axes or by linear combinations of the original axes, including space ...
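The abstract mentions a multi-resolution data structure and the absence of distance calculations, but does not describe the structure itself. The sketch below is therefore not the MrCC algorithm. It only illustrates the general idea of counting points in grid cells at several resolutions. The function names, the normalization of coordinates to [0, 1), and the halving of cell size at each level are all assumptions made for illustration.

```python
from collections import Counter


def multires_counts(points, levels):
    """Count points per grid cell at each resolution level.

    Illustrative assumption: every coordinate lies in [0, 1). At level h
    each axis is split into 2**h equal intervals, so a point's cell is
    found by integer truncation alone; no pairwise distances are computed.
    """
    counts = []
    for h in range(1, levels + 1):
        side = 2 ** h
        level = Counter()
        for p in points:
            # Clamp so that a coordinate very close to 1.0 stays in range.
            cell = tuple(min(int(x * side), side - 1) for x in p)
            level[cell] += 1
        counts.append(level)
    return counts


def densest_cell(level_counts):
    """Return the (cell, count) pair with the highest count at one level."""
    return max(level_counts.items(), key=lambda kv: kv[1])


points = [(0.10, 0.10), (0.12, 0.14), (0.90, 0.20), (0.11, 0.13)]
grid = multires_counts(points, levels=2)
print(densest_cell(grid[0]))
print(densest_cell(grid[1]))
```

Each level takes one pass over the points and touches each coordinate once, so the work in this sketch grows linearly with the number of points and with the number of axes. How MrCC decides which dense cells are statistically significant is not covered by the abstract and is not attempted here.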
Added 26 Jan 2011
Updated 26 Jan 2011
Type Journal
Year 2010
Where ICDE
Authors Robson Leonardo Ferreira Cordeiro, Agma J. M. Traina, Christos Faloutsos, Caetano Traina Jr.