Sciweavers

IDEAL
2000
Springer

Quantization of Continuous Input Variables for Binary Classification

Quantization of continuous variables is important in data analysis, especially for model classes such as Bayesian networks and decision trees, which use discrete variables. Often, the discretization is based only on the distribution of the input variables, whereas additional information, for example in the form of class membership, is frequently present and could be used to improve the quality of the results. In this paper, quantization methods based on equal-width intervals, maximum entropy, maximum mutual information, and a novel approach based on maximum mutual information combined with entropy are considered. The two former approaches do not take the class membership into account, whereas the two latter approaches do. The relative merits of each method are compared in an empirical setting, where results are shown for two data sets in a direct marketing problem, and the quality of quantization is measured by mutual information and the performance of Naive Bayes and C5 decision tree cl...
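To make the two class-blind methods and the quality measure concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that maximum entropy quantization can be approximated by equal-frequency (quantile) bins, since equally populated bins maximize the entropy of the quantized variable, and it scores a quantization by the empirical mutual information between bin index and class label. All function names and the toy data are hypothetical, and the class-aware methods from the abstract are not reproduced here.

```python
import math
from collections import Counter


def equal_width_edges(values, n_bins):
    """Interior cut points splitting [min, max] into n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]


def equal_frequency_edges(values, n_bins):
    """Interior cut points at empirical quantiles (assumed stand-in for
    maximum entropy quantization: bins hold roughly equal counts)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // n_bins] for i in range(1, n_bins)]


def quantize(values, edges):
    """Map each value to its bin index: the number of cut points <= value."""
    return [sum(v >= e for e in edges) for v in values]


def mutual_information(bins, labels):
    """Empirical mutual information I(bin; label) in bits."""
    n = len(bins)
    joint = Counter(zip(bins, labels))
    bin_counts = Counter(bins)
    label_counts = Counter(labels)
    return sum(
        (c / n) * math.log2(c * n / (bin_counts[b] * label_counts[y]))
        for (b, y), c in joint.items()
    )


# Hypothetical toy data: one skewed input variable and a binary class label.
values = [0.1, 0.2, 0.3, 0.4, 5.0, 6.0, 7.0, 100.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

for name, make_edges in [("equal width", equal_width_edges),
                         ("equal frequency", equal_frequency_edges)]:
    bins = quantize(values, make_edges(values, 2))
    print(name, bins, round(mutual_information(bins, labels), 4))
```

On this toy input the single outlier stretches the equal-width interval so that almost all points fall into one bin, while the equal-frequency split happens to separate the two classes. This is an artifact of the chosen data and says nothing about the paper's empirical results.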
Michal Skubacz, Jaakko Hollmén
Added 25 Aug 2010
Updated 25 Aug 2010
Type Conference
Year 2000
Where IDEAL
Authors Michal Skubacz, Jaakko Hollmén