CMP: A Fast Decision Tree Classifier Using Multivariate Predictions

Most decision tree classifiers are designed to keep class histograms for single attributes, and to select a particular attribute for the next split using those histograms. In this paper, we propose a technique where, by keeping histograms on attribute pairs, we achieve (i) a significant speed-up over traditional classifiers based on single-attribute splitting, and (ii) the ability to build classifiers that use linear combinations of values from non-categorical attribute pairs as the split criterion. Indeed, by keeping two-dimensional histograms, CMP can often predict the best successive split, in addition to computing the current one; therefore, CMP is normally able to grow more than one level of a decision tree for each data scan. CMP's performance improvements are also due to techniques whereby non-categorical attributes are discretized without loss in classification accuracy; in fact, we introduce simple techniques, whereby classification errors caused by discretization at one s...
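The idea of growing more than one tree level from a single scan can be illustrated with a small sketch. The code below is not the authors' CMP algorithm. It is a minimal illustration that assumes two already-discretized attributes (called A and B here), axis-parallel splits only, and Gini impurity as the split measure. All function names (`build_pair_histogram`, `region_counts`, `best_split`) and the toy data are hypothetical. It shows how one two-dimensional class histogram, built in a single pass, can answer both the root split and a follow-up split in a child region without rescanning the records.

```python
def gini(counts):
    """Gini impurity of a list of per-class counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts)


def build_pair_histogram(rows, labels, n_classes):
    """One data scan: hist[i][j][c] counts records in bin i of A, bin j of B, class c."""
    n_a = max(a for a, _ in rows) + 1
    n_b = max(b for _, b in rows) + 1
    hist = [[[0] * n_classes for _ in range(n_b)] for _ in range(n_a)]
    for (a, b), y in zip(rows, labels):
        hist[a][b][y] += 1
    return hist


def region_counts(hist, a_range, b_range):
    """Per-class counts inside the rectangle a_range x b_range of the histogram."""
    counts = [0] * len(hist[0][0])
    for i in a_range:
        for j in b_range:
            for c, v in enumerate(hist[i][j]):
                counts[c] += v
    return counts


def best_split(hist, a_range, b_range):
    """Best axis-parallel split 'bin < t' inside a region.

    Returns (weighted_gini, axis, t), with axis 0 for A and axis 1 for B.
    Only the histogram is consulted; the records are never re-read.
    """
    best = (float("inf"), None, None)
    for axis, rng in ((0, a_range), (1, b_range)):
        for t in range(rng.start + 1, rng.stop):
            lo, hi = range(rng.start, t), range(t, rng.stop)
            if axis == 0:
                left = region_counts(hist, lo, b_range)
                right = region_counts(hist, hi, b_range)
            else:
                left = region_counts(hist, a_range, lo)
                right = region_counts(hist, a_range, hi)
            n_l, n_r = sum(left), sum(right)
            if n_l == 0 or n_r == 0:
                continue
            score = (n_l * gini(left) + n_r * gini(right)) / (n_l + n_r)
            if score < best[0]:
                best = (score, axis, t)
    return best


# Toy data (made up): one record per cell of a 4 x 3 grid of bins;
# class 1 exactly when A-bin >= 2 and B-bin >= 1.
rows = [(a, b) for a in range(4) for b in range(3)]
labels = [1 if a >= 2 and b >= 1 else 0 for a, b in rows]

hist = build_pair_histogram(rows, labels, n_classes=2)  # the only data scan

root = best_split(hist, range(0, 4), range(0, 3))   # current split
child = best_split(hist, range(2, 4), range(0, 3))  # next split, same histogram
print(root, child)
```

In this toy case the root split falls on A and the split of the right child falls on B, both read from the same histogram. Splits on linear combinations of two attributes, which the abstract also mentions, are not covered by this sketch.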
Haixun Wang, Carlo Zaniolo
Added 01 Nov 2009
Updated 01 Nov 2009
Type Conference
Year 2000
Where ICDE