GFKL
2004
Springer

Density Estimation and Visualization for Data Containing Clusters of Unknown Structure

Abstract. A method is proposed for measuring the density of data sets that contain an unknown number of clusters of unknown sizes. This method, called Pareto Density Estimation (PDE), uses hyperspheres to estimate data density. The radius of the hyperspheres is derived from information optimal sets. PDE leads to a tool for visualizing the probability density distributions of variables (PDEplot). For Gaussian mixture data, this is an optimal empirical density estimation. A new kind of visualization of the density structure of high-dimensional data sets, the P-Matrix, is defined. The P-Matrix for a 79-dimensional data set from DNA array analysis is shown. The P-Matrix reveals local concentrations of data points representing similar gene expressions. The P-Matrix is also a very effective tool for detecting clusters and outliers in data sets.
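
The hypersphere idea in the abstract can be illustrated with a minimal Python sketch: for each data point, count how many data points fall inside a sphere of fixed radius centred on it. This is not the paper's implementation. The function names are hypothetical, and the radius rule here (a quantile of the pairwise distances, with an illustrative default of 0.2) is an assumption standing in for the radius that PDE derives from information optimal sets, which the abstract does not specify.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distances between all unordered pairs of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def radius_from_quantile(points, quantile=0.2):
    """Pick a sphere radius as a quantile of the pairwise distances.

    Assumption: a stand-in for the radius that PDE derives from
    information optimal sets; the quantile value is illustrative only.
    """
    dists = sorted(pairwise_distances(points))
    index = min(len(dists) - 1, int(quantile * len(dists)))
    return dists[index]


def sphere_counts(points, radius):
    """For each point, count the data points within `radius` of it.

    The point itself is included, so every count is at least 1.
    """
    return [
        sum(1 for q in points if math.dist(p, q) <= radius)
        for p in points
    ]


# Toy data: a tight group of four points plus one distant outlier.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
radius = radius_from_quantile(points)
counts = sphere_counts(points, radius)
```

Points in the tight group receive higher counts than the distant point, which is the kind of local concentration a density-based view is meant to expose. The brute-force counting is quadratic in the number of points and is meant only to show the idea on small data.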
Alfred Ultsch
Added 01 Jul 2010
Updated 01 Jul 2010
Type Conference
Year 2004
Where GFKL
Authors Alfred Ultsch