Sciweavers

BMCBI
2007

Robust clustering in high dimensional data using statistical depths

Background: Mean-based clustering algorithms such as bisecting k-means generally lack robustness. Although the componentwise median is a more robust alternative, it can be a poor center representative for high-dimensional data. A new algorithm is needed that is robust and works well on high-dimensional data sets, e.g., gene expression data.

Results: Here we propose a new robust divisive clustering algorithm, the bisecting k-spatialMedian, based on the statistical spatial depth. A new subcluster selection rule, Relative Average Depth, is also introduced. We demonstrate that the proposed clustering algorithm outperforms the componentwise-median-based bisecting k-median algorithm for high dimension and low sample size (HDLSS) data by applying both algorithms to two real HDLSS gene expression data sets. When further applied to noisy real data sets, the proposed algorithm compares favorably in terms of robustness with the componentwise-median-based bisecting k-median algorithm.

Conclusion: ...
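To make the two center notions in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It uses the standard textbook definitions: the spatial depth of a point is one minus the norm of the average unit vector pointing to it from the data, and the spatial median is the point minimizing the sum of Euclidean distances, approximated here with Weiszfeld's iteration. The function names, the iteration count, and the tolerance are assumptions chosen for illustration; the Relative Average Depth rule is not reproduced because the abstract does not define it.

```python
import math


def spatial_depth(x, points):
    """Spatial depth of x: 1 - norm of the mean unit vector from each point to x."""
    dim = len(x)
    total = [0.0] * dim
    for p in points:
        diff = [xi - pi for xi, pi in zip(x, p)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:  # a point coinciding with x contributes a zero vector
            for j in range(dim):
                total[j] += diff[j] / norm
    return 1.0 - math.sqrt(sum(t * t for t in total)) / len(points)


def spatial_median(points, iters=200, tol=1e-9):
    """Approximate the spatial median with Weiszfeld's iteration (assumed settings)."""
    dim = len(points[0])
    n = len(points)
    m = [sum(p[j] for p in points) / n for j in range(dim)]  # start at the mean
    for _ in range(iters):
        num = [0.0] * dim
        den = 0.0
        for p in points:
            d = math.sqrt(sum((pj - mj) ** 2 for pj, mj in zip(p, m)))
            w = 1.0 / max(d, 1e-12)  # guard against division by zero
            den += w
            for j in range(dim):
                num[j] += w * p[j]
        new_m = [v / den for v in num]
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_m, m)))
        m = new_m
        if shift < tol:
            break
    return m


def componentwise_median(points):
    """Median taken independently along each coordinate."""
    dim = len(points[0])
    center = []
    for j in range(dim):
        col = sorted(p[j] for p in points)
        mid = len(col) // 2
        center.append(col[mid] if len(col) % 2 else 0.5 * (col[mid - 1] + col[mid]))
    return center


# Toy data (made up for illustration): four corner points plus one gross outlier.
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1000.0, 1000.0)]
center = spatial_median(data)
print("spatial median:", center)
print("componentwise median:", componentwise_median(data))
print("depth of spatial median:", spatial_depth(center, data))
print("depth of outlier:", spatial_depth(data[-1], data))
```

In this toy data set the outlier pulls the mean far from the bulk of the points, while the spatial median stays near the four corner points and the outlier receives a much lower depth than the center. A divisive scheme of the kind the abstract describes would presumably use such a center when splitting a cluster in two, but the exact splitting and selection steps are left to the paper.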
Added 12 Dec 2010
Updated 12 Dec 2010
Type Journal
Year 2007
Where BMCBI
Authors Yuanyuan Ding, Xin Dang, Hanxiang Peng, Dawn Wilkins