This paper considers online compression algorithms that use at most polylogarithmic space (plogon). These algorithms correspond to compressors in the data stream model. We study th...
Most time series data mining algorithms use similarity search as a core subroutine, and thus the time taken for similarity search is the bottleneck for virtually all time series d...
Thanawin Rakthanmanon, Bilson J. L. Campana, Abdul...
While scalable data mining methods are expected to cope with massive Web data, coping with evolving trends in noisy data in a continuous fashion, and without any unnecessary stopp...
We study the problem of finding frequent items in a continuous stream of itemsets. A new frequency measure is introduced, based on a flexible window length. For a given item, its ...
Random sampling is one of the most fundamental data management tools available. However, most current research involving sampling considers the problem of how to use a sample, and...