Sciweavers

PKDD
2007
Springer

Improved Algorithms for Univariate Discretization of Continuous Features

In discretization of a continuous variable its numerical value range is divided into a few intervals that are used in classification. For example, Naïve Bayes can benefit from this processing. A commonly used supervised discretization method is Fayyad and Irani’s recursive entropy-based splitting of a value range. The technique uses MDL as a model selection criterion to decide whether to accept the proposed split. We argue that theoretically the method is not always close to ideal for this application. Empirical experiments support our finding. We give a statistical rule that does not use the ad-hoc rule of Fayyad and Irani’s approach to increase its performance. This rule, though, is quite time consuming to compute. We also demonstrate that a very simple Bayesian method performs better than MDL as a model selection criterion.
Jussi Kujala, Tapio Elomaa
Added 09 Jun 2010
Updated 09 Jun 2010
Type Conference
Year 2007
Where PKDD
Authors Jussi Kujala, Tapio Elomaa
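The baseline the abstract refers to, recursive entropy-based splitting with an MDL test deciding whether each proposed split is accepted, can be sketched roughly as below. This is only an illustration of the commonly stated form of Fayyad and Irani's criterion, not the improved statistical or Bayesian rules proposed in the paper, whose details the abstract does not give. The function names (`entropy`, `best_split`, `mdl_accepts`, `discretize`) are hypothetical and chosen for this sketch.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_split(xs, ys):
    """Find the boundary with the highest information gain.

    xs must be sorted ascending. Returns (gain, i), meaning the split falls
    between positions i-1 and i, or None if all values are equal.
    """
    n = len(xs)
    base = entropy(ys)
    best = None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # a cut point must separate two distinct values
        gain = (base
                - (i / n) * entropy(ys[:i])
                - ((n - i) / n) * entropy(ys[i:]))
        if best is None or gain > best[0]:
            best = (gain, i)
    return best


def mdl_accepts(ys, i, gain):
    """MDL acceptance test in its commonly stated form (an assumption here):
    accept if gain > (log2(n - 1) + delta) / n."""
    n = len(ys)
    left, right = ys[:i], ys[i:]
    k, k1, k2 = len(set(ys)), len(set(left)), len(set(right))
    delta = log2(3 ** k - 2) - (
        k * entropy(ys) - k1 * entropy(left) - k2 * entropy(right)
    )
    return gain > (log2(n - 1) + delta) / n


def discretize(values, labels):
    """Return the sorted cut points found by recursive splitting."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    xs = [v for v, _ in pairs]
    ys = [y for _, y in pairs]
    cuts = []

    def recurse(lo, hi):
        sub_x, sub_y = xs[lo:hi], ys[lo:hi]
        found = best_split(sub_x, sub_y)
        if found is None:
            return
        gain, i = found
        if not mdl_accepts(sub_y, i, gain):
            return  # split rejected: this interval stays whole
        cuts.append((sub_x[i - 1] + sub_x[i]) / 2)
        recurse(lo, lo + i)
        recurse(lo + i, hi)

    recurse(0, len(xs))
    return sorted(cuts)
```

For ten values 1 to 10 where the first five belong to one class and the last five to another, `discretize` returns the single cut point 5.5; for a short sequence with alternating labels, the MDL test rejects every candidate and no cut is made.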