Decision Trees for Uncertain Data

Traditional decision tree classifiers work with data whose values are known and precise. We extend such classifiers to handle data with uncertain information, which originates from measurement or quantisation errors, data staleness, multiple repeated measurements, etc. The value uncertainty is represented by multiple values forming a probability distribution function (pdf). We discover that the accuracy of a decision tree classifier can be much improved if the whole pdf, rather than a simple statistic, is taken into account. We extend classical decision tree building algorithms to handle data tuples with uncertain values. Since processing pdfs is computationally more costly, we propose a series of pruning techniques that can greatly improve the efficiency of the construction of decision trees.
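To make the contrast between "the whole pdf" and "a simple statistic" concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes each uncertain value is given as a small discrete set of sample points with probabilities, and it assumes a tuple whose pdf straddles a split point contributes fractionally to both sides. The names `entropy`, `split_entropy`, and `use_pdf` are hypothetical, chosen only for this illustration.

```python
import math
from collections import defaultdict


def entropy(counts):
    """Shannon entropy (in bits) of a dict of possibly fractional class counts."""
    total = sum(counts.values())
    if total <= 0:
        return 0.0
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log2(p)
    return h


def split_entropy(tuples, z, use_pdf=True):
    """Weighted entropy after splitting one numeric attribute at point z.

    Each tuple is (points, probs, label), where points/probs form an
    assumed discrete pdf for the uncertain attribute value.
    use_pdf=True  : the tuple adds its probability mass <= z to the left
                    branch and the rest to the right (fractional counts).
    use_pdf=False : the pdf is collapsed to its mean, so the tuple goes
                    entirely to one branch.
    """
    left = defaultdict(float)
    right = defaultdict(float)
    for points, probs, label in tuples:
        if use_pdf:
            p_left = sum(p for x, p in zip(points, probs) if x <= z)
        else:
            mean = sum(x * p for x, p in zip(points, probs))
            p_left = 1.0 if mean <= z else 0.0
        left[label] += p_left
        right[label] += 1.0 - p_left
    n_left = sum(left.values())
    n_right = sum(right.values())
    n = n_left + n_right
    return (n_left / n) * entropy(left) + (n_right / n) * entropy(right)


# Two tuples with the same mean (5.0) but different spreads and classes.
data = [
    ([0.0, 10.0], [0.5, 0.5], "A"),
    ([4.0, 6.0], [0.5, 0.5], "B"),
]
print(split_entropy(data, 2.0, use_pdf=False))
print(split_entropy(data, 2.0, use_pdf=True))
```

In this toy data set the two tuples have identical means, so a mean-based split cannot tell them apart at any split point, while the pdf-based count yields a lower weighted entropy at `z = 2.0`. The extra work per candidate split (summing probability mass instead of comparing one number) also hints at why the abstract calls pdf processing more costly and proposes pruning.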
Added 20 Oct 2009
Updated 20 Oct 2009
Type Conference
Year 2009
Where ICDE
Authors Smith Tsang, Ben Kao, Kevin Y. Yip, Wai-Shing Ho, Sau Dan Lee