Multivariate Discretization by Recursive Supervised Bipartition of Graph

Abstract. In supervised learning, discretization of the continuous explanatory attributes enhances the accuracy of decision tree induction algorithms and of the naive Bayes classifier. Many discretization methods have been developed, leading to precise and comprehensible evaluations of the amount of information contained in a single attribute with respect to the target attribute. In this paper, we discuss the multivariate notion of neighborhood, extending the univariate notion of interval. We propose an evaluation criterion for bipartitions, which is based on the Minimum Description Length (MDL) principle [1], and apply it recursively. The resulting discretization method is thus able to exploit correlations between continuous attributes. Its accuracy and robustness are evaluated on real and synthetic data sets.

1 Supervised Partitioning Problems

In supervised learning, many inductive algorithms are known to produce better models by discretizing continuous attributes. For example, the naive Bayes...
Sylvain Ferrandiz, Marc Boullé
Added 28 Jun 2010
Updated 28 Jun 2010
Type Conference
Year 2005
Where MLDM