KDD
2004
ACM

Parallel computation of high dimensional robust correlation and covariance matrices

The computation of covariance and correlation matrices is critical to many data mining applications and processes. Unfortunately, the classical covariance and correlation matrices are very sensitive to outliers. Robust methods, such as QC and the Maronna method, have been proposed. However, existing algorithms for QC give acceptable performance only when the dimensionality of the matrix is in the hundreds, and the Maronna method is rarely used in practice because of its high computational cost. In this paper, we develop parallel algorithms for both QC and the Maronna method. We evaluate these parallel algorithms using a real data set of the gene expression of over 6,000 genes, giving rise to a matrix of over 18 million entries. In our experimental evaluation, we explore scalability in dimensionality and in the number of processors, and the trade-offs between accuracy and computational efficiency. We also compare the parallel behaviours of the two methods. From a statistical standpoint...
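To illustrate the outlier sensitivity the abstract describes, here is a minimal serial sketch in Python. It assumes that QC stands for the quadrant correlation estimator (the abstract does not expand the acronym) and uses the plain sign-based form without any consistency correction. The function name `quadrant_correlation` and the toy data are hypothetical, and this is not the parallel algorithm developed in the paper.

```python
import numpy as np


def quadrant_correlation(X):
    """Pairwise quadrant correlation of the columns of X (hypothetical helper).

    Assumption: QC is read as quadrant correlation, i.e. the average product
    of the signs of median-centred values, with no consistency correction.
    """
    X = np.asarray(X, dtype=float)
    # Centre each column at its median and keep only the sign (-1, 0 or +1).
    signs = np.sign(X - np.median(X, axis=0))
    # Entry (j, k) is the mean over rows of sign_j * sign_k.
    return signs.T @ signs / X.shape[0]


# Toy data: two perfectly correlated columns with 101 observations.
x = np.arange(101, dtype=float)
clean = np.column_stack([x, x])

# Same data with the last observation replaced by one gross outlier.
dirty = clean.copy()
dirty[-1] = [1000.0, -1000.0]

pearson_clean = np.corrcoef(clean, rowvar=False)[0, 1]
pearson_dirty = np.corrcoef(dirty, rowvar=False)[0, 1]
qc_clean = quadrant_correlation(clean)[0, 1]
qc_dirty = quadrant_correlation(dirty)[0, 1]

print("classical:", pearson_clean, "->", pearson_dirty)
print("quadrant: ", qc_clean, "->", qc_dirty)
```

In this toy case a single outlier flips the classical correlation from positive to negative, while the sign-based estimate barely moves. Each entry of the matrix depends on only two columns, so blocks of entries can in principle be computed independently; how the paper actually distributes that work across processors is not shown here.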
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2004
Where KDD
Authors James Chilson, Raymond T. Ng, Alan Wagner, Ruben H. Zamar