Streaming Random Forests

Many recent applications deal with data streams: conceptually endless sequences of data records, often arriving at high flow rates. Standard data-mining techniques typically assume that records can be accessed multiple times and so do not naturally extend to streaming data. Algorithms for mining streams must be able to extract all necessary information from records with only one, or perhaps a few, passes over the data. We present the Streaming Random Forests algorithm, an online and incremental stream classification algorithm that extends Breiman’s Random Forests algorithm. The Streaming Random Forests algorithm grows multiple decision trees and classifies unlabelled records based on the plurality of tree votes. We evaluate the classification accuracy of the Streaming Random Forests algorithm on several datasets and show that its accuracy is comparable to that of the standard Random Forests algorithm.
Keywords Data mining, Classification, Decision trees, Data-stream classification, R...
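The voting step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the name plurality_vote, the representation of a tree as a plain callable, and the toy threshold rules are all assumptions made for illustration, and the tie-breaking behaviour (first label encountered wins) is a property of this sketch only.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, Sequence

# Assumption: a "tree" is any callable that maps a record to a class label.
# A real forest would use trained decision trees here.
Tree = Callable[[Sequence[float]], Hashable]


def plurality_vote(trees: Iterable[Tree], record: Sequence[float]) -> Hashable:
    """Return the label that receives the most votes across all trees."""
    votes = Counter(tree(record) for tree in trees)
    # most_common(1) yields the top (label, count) pair; among tied labels,
    # the one counted first is returned.
    label, _count = votes.most_common(1)[0]
    return label


# Three toy "trees" built from simple threshold rules (purely illustrative).
toy_forest = [
    lambda r: "a" if r[0] > 0.5 else "b",
    lambda r: "a" if r[1] > 0.5 else "b",
    lambda r: "b",
]

print(plurality_vote(toy_forest, [0.9, 0.8]))  # two votes for "a", one for "b"
```

Plurality voting only needs each tree's current prediction, so it applies whether the trees were grown in batch or incrementally from a stream.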
Added 03 Jun 2010
Updated 21 Feb 2012
Type Conference
Year 2007
Authors Hanady Abdulsalam, David B. Skillicorn, Patrick Martin