
SAC
2006
ACM

Discretization from data streams: applications to histograms and data mining

Abstract. In this paper we propose a new method to perform incremental discretization. The basic idea is to perform the task in two layers. The first layer receives the sequence of input data and keeps statistics on the data using many more intervals than required. Based on the statistics stored by the first layer, a second layer creates the final discretization. The proposed architecture processes streaming examples in a single scan, in constant time and space, even for infinite sequences of examples. We experimentally demonstrate that incremental discretization is able to maintain the performance of learning algorithms in comparison to batch discretization. The proposed method is much more appropriate for incremental learning and for problems where data flows continuously, as in most recent data mining applications.
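The two-layer idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name TwoLayerDiscretizer, the fixed value range [lo, hi], the equal-width fine bins in the first layer, and the equal-frequency merge in the second layer are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from bisect import bisect_right


class TwoLayerDiscretizer:
    """Hypothetical two-layer incremental discretizer (illustrative only)."""

    def __init__(self, lo, hi, n_fine=200):
        # Assumption: the value range [lo, hi] is known in advance.
        self.lo = lo
        self.n_fine = n_fine              # many more bins than finally needed
        self.width = (hi - lo) / n_fine
        self.counts = [0] * n_fine        # layer-1 statistics: one counter per bin
        self.total = 0

    def update(self, x):
        # Layer 1: constant-time, constant-space update per streamed value.
        i = int((x - self.lo) / self.width)
        i = min(max(i, 0), self.n_fine - 1)   # clamp out-of-range values
        self.counts[i] += 1
        self.total += 1

    def cut_points(self, k):
        # Layer 2: merge fine bins into at most k intervals of roughly
        # equal frequency (assumed criterion), using only layer-1 counts.
        cuts = []
        if self.total == 0:
            return cuts
        target = self.total / k
        cum, next_q = 0, 1
        for i, c in enumerate(self.counts):
            cum += c
            if next_q < k and cum >= target * next_q:
                cuts.append(self.lo + (i + 1) * self.width)
                # Skip quantiles already covered by this bin to avoid
                # emitting the same cut point twice.
                while next_q < k and cum >= target * next_q:
                    next_q += 1
        return cuts

    @staticmethod
    def discretize(x, cuts):
        # Map a value to the index of its final interval.
        return bisect_right(cuts, x)
```

In this sketch the stream is scanned once through update, and cut_points can be called at any time to derive a final discretization from the stored counts without revisiting the data.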
João Gama, Carlos Pinto
Added 14 Jun 2010
Updated 14 Jun 2010
Type Conference
Year 2006
Where SAC