Diagnosing extrapolation: tree-based density estimation

11 years 6 months ago
Diagnosing extrapolation: tree-based density estimation
There has historically been very little concern with extrapolation in Machine Learning, yet extrapolation can be critical to diagnose. Predictor functions are almost always learned on a set of highly correlated data comprising a very small segment of predictor space. Moreover, flexible predictors, by their very nature, are not controlled at points of extrapolation. This becomes a problem for diagnostic tools that require evaluation on a product distribution. It is also an issue when we are trying to optimize a response over some variable in the input space. Finally, it can be a problem in non-static systems in which the underlying predictor distribution gradually drifts with time or when typographical errors misrecord the values of some predictors. We present a diagnosis for extrapolation as a statistical test for a point originating from the data distribution as opposed to a null hypothesis uniform distribution. This allows us to employ general classification methods for estimating s...
Giles Hooker
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2004
Where KDD
Authors Giles Hooker
Comments (0)