Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution

Feature selection, as a preprocessing step to machine learning, has been effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving comprehensibility. However, the recent increase in the dimensionality of data poses a severe challenge to many existing feature selection methods with respect to efficiency and effectiveness. In this work, we introduce a novel concept, predominant correlation, and propose a fast filter method that can identify relevant features, as well as redundancy among relevant features, without pairwise correlation analysis. The efficiency and effectiveness of our method are demonstrated through extensive comparisons with other methods using real-world data of high dimensionality.
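
The filter idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes discrete-valued features and uses symmetrical uncertainty as the correlation measure, which the abstract itself does not name. The function names (`entropy`, `symmetrical_uncertainty`, `fcbf_sketch`) and the `delta` threshold parameter are hypothetical choices made for this illustration. A feature is kept only if no already-kept, more relevant feature is at least as correlated with it as it is with the class, so most feature pairs are never compared.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(xs, ys):
    """Normalized mutual information: 2 * I(X; Y) / (H(X) + H(Y))."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    joint = entropy(list(zip(xs, ys)))
    mutual_information = hx + hy - joint
    return 2.0 * mutual_information / (hx + hy)


def fcbf_sketch(features, target, delta=0.0):
    """Return names of features judged relevant and non-redundant.

    features: dict mapping a feature name to a list of discrete values.
    target:   list of class labels, same length as each feature column.
    delta:    relevance threshold (an assumed tuning parameter).
    """
    # Step 1: score each feature against the class and drop weak ones.
    relevance = {
        name: symmetrical_uncertainty(column, target)
        for name, column in features.items()
    }
    ranked = sorted(
        (name for name in relevance if relevance[name] >= delta),
        key=lambda name: relevance[name],
        reverse=True,
    )

    # Step 2: walk the ranking from most to least relevant. A feature is
    # treated as redundant when some already-kept feature is at least as
    # correlated with it as it is with the class.
    selected = []
    for name in ranked:
        redundant = any(
            symmetrical_uncertainty(features[kept], features[name])
            >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected
```

Calling `fcbf_sketch` with a dictionary of feature columns and a list of class labels returns the retained feature names in order of decreasing relevance. Continuous features would need to be discretized first, which this sketch leaves out.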

Lei Yu, Huan Liu

Fast Filter Method | Feature Selection Methods | ICML 2003 | Machine Learning | Pairwise Correlation Analysis |

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2003
Where	ICML
Authors	Lei Yu, Huan Liu
