KDD 2003, ACM

Efficiently handling feature redundancy in high-dimensional data

High-dimensional data poses a severe challenge for data mining. Feature selection is a frequently used technique in preprocessing high-dimensional data for successful data mining. Traditionally, feature selection is focused on removing irrelevant features. However, for high-dimensional data, removing redundant features is equally critical. In this paper, we provide a study of feature redundancy in high-dimensional data and propose a novel correlation-based approach to feature selection within the filter model. The extensive empirical study using real-world data shows that the proposed approach is efficient and effective in removing redundant and irrelevant features.

Categories and Subject Descriptors: H.2.8 [Database Management]: Database Applications - data mining; I.2.6 [Artificial Intelligence]: Learning; I.5.2 [Pattern Recognition]: Design Methodology - feature evaluation and selection

Keywords: Feature selection, redundancy, high-dimensional data
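The abstract names two goals for a correlation-based filter: dropping features that are irrelevant to the target and dropping features that are redundant with ones already kept. The sketch below illustrates that general idea only. It is not the procedure from the paper: the abstract does not say which correlation measure or thresholds are used, so absolute Pearson correlation, the `relevance_threshold` parameter, the function names, and the toy data are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, relevance_threshold=0.1):
    """Hypothetical filter: keep relevant features that are not redundant.

    columns: dict mapping feature name -> list of numeric values.
    target:  list of numeric class labels, same length as each column.
    """
    # Relevance = |correlation with the target| (an assumed measure).
    relevance = {name: abs(pearson(col, target)) for name, col in columns.items()}

    # Step 1: discard irrelevant features, then rank the rest by relevance.
    candidates = sorted(
        (name for name in columns if relevance[name] >= relevance_threshold),
        key=lambda name: relevance[name],
        reverse=True,
    )

    # Step 2: walk down the ranking; a feature is treated as redundant if it is
    # at least as correlated with an already-kept feature as with the target.
    selected = []
    for name in candidates:
        redundant = any(
            abs(pearson(columns[name], columns[kept])) >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Toy data (made up): "a" tracks the target, "b" is a near-copy of "a",
# and "c" is uncorrelated with the target.
target = [0, 0, 0, 1, 1, 1, 0, 1]
features = {
    "a": [1, 2, 3, 7, 8, 9, 2, 8],
    "b": [1, 2, 4, 6, 8, 9, 3, 7],
    "c": [1, 0, 1, 0, 1, 0, 0, 1],
}
selected = select_features(features, target)
print(selected)
```

On this toy data, `c` is removed as irrelevant and `b` is removed as redundant with `a`, so only `a` remains. Because it scores features without training a learner, this is a filter-style method in the sense the abstract uses; the pairwise redundancy check costs at most quadratic time in the number of relevant features.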

Lei Yu, Huan Liu

Related Content

» Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution

» Bayesian regression with input noise for high dimensional data

» Feature Selection for Classifying High-Dimensional Numerical Data

» Clustering of High-Dimensional Gene Expression Data with Feature Filtering Methods and Diff...

» Learning Sparse SVM for Feature Selection on Very High Dimensional Datasets

» Feature Extraction for Outlier Detection in High-Dimensional Spaces

» Scalable Clustering for Large High-Dimensional Data Based on Data Summarization

» High-Dimensional Feature Matching: Employing the Concept of Meaningful Nearest Neighbors

» Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data

Post Info

Added	30 Nov 2009
Updated	30 Nov 2009
Type	Conference
Year	2003
Where	KDD
Authors	Lei Yu, Huan Liu