Forest trees for on-line data

This paper presents a hybrid adaptive system for inducing a forest of trees from data streams. The Ultra Fast Forest Tree system (UFFT) is an incremental algorithm that works online, processes each example in constant time, and uses the Hoeffding bound to decide when to install a splitting test in a leaf, turning it into a decision node. Our system has been designed for continuous data. It uses analytical techniques to choose the splitting criteria, and the information gain to estimate the merit of each possible splitting test. The number of examples required to evaluate the splitting criteria is sound, based on the Hoeffding bound. For multiclass problems, the algorithm builds a binary tree for each possible pair of classes, leading to a forest of trees. During the training phase the algorithm maintains a short-term memory. Given a data stream, a fixed number of the most recent examples are maintained in a data structure that supports constant-time insertion and deletion. When a te...
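The three mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`hoeffding_bound`, `should_split`, `class_pairs`, `make_short_term_memory`) and the default `delta` are assumptions made for illustration. The split rule is the one commonly used in Hoeffding trees, which compares the two best candidate splits; the abstract does not spell out UFFT's exact test.

```python
import math
from collections import deque
from itertools import combinations


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with range `value_range` lies within the returned epsilon
    of the mean observed over n independent examples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, n, n_classes=2, delta=1e-6):
    """Assumed split rule (common in Hoeffding trees): split once the gap
    between the two best information gains exceeds epsilon."""
    # Information gain is bounded by log2(number of classes);
    # for a binary tree (one pair of classes) the range is 1.
    value_range = math.log2(n_classes)
    return best_gain - second_gain > hoeffding_bound(value_range, delta, n)


def class_pairs(classes):
    """One binary tree per unordered pair of classes: k*(k-1)/2 trees."""
    return list(combinations(classes, 2))


def make_short_term_memory(capacity):
    """Fixed-size buffer of the most recent examples. Appending to a full
    deque drops the oldest item, and both operations take constant time."""
    return deque(maxlen=capacity)
```

With three classes, `class_pairs(["a", "b", "c"])` yields three pairs, so the forest would hold three binary trees. A `deque` with `maxlen` is one straightforward way to get constant-time insertion and deletion of the oldest example; the data structure actually used in UFFT is not stated in the text above.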
João Gama, Pedro Medas, Ricardo Rocha
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Where SAC
Authors João Gama, Pedro Medas, Ricardo Rocha