Sciweavers

TKDE
2010

Dictionary-Based Compression for Long Time-Series Similarity

Long time-series datasets are common in many domains, especially scientific domains. Applications in these fields often require comparing trajectories using similarity measures. Existing methods perform well for short time-series, but their evaluation cost increases rapidly for longer time-series. In this work, we develop a new time-series similarity measure called the Dictionary Compression Score (DCS). We also show that this method allows us to accurately and quickly calculate similarity for both short and long time-series. We use the well-known Kolmogorov complexity from information theory and the Lempel-Ziv compression framework as a basis to calculate similarity scores. We show that off-the-shelf compressors do not fare well for computing time-series similarity. To address this problem, we developed a novel dictionary-based compression technique to compute time-series similarity. We also develop heuristics to automatically identify suitable ...
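To make the compression-based idea in the abstract concrete, the sketch below computes a normalized compression distance with Python's `zlib`, a Lempel-Ziv-style off-the-shelf compressor. This is not the paper's DCS: it is the generic baseline of the kind the abstract says does not fare well, shown only to illustrate how compressed size can stand in for Kolmogorov complexity. The function names `compressed_size`, `quantize`, and `ncd`, and the 16-level uniform binning, are assumptions made for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Bytes needed by zlib at its highest level; a computable
    stand-in for the (uncomputable) Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def quantize(series, levels=16):
    """Turn a numeric series into bytes by uniform binning.
    The binning scheme is an illustrative assumption, not the paper's."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for constant series
    return bytes(min(levels - 1, int((v - lo) / span * levels)) for v in series)


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: smaller means more similar."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # cost of describing both together
    return (cxy - min(cx, cy)) / max(cx, cy)
```

A typical use would be to quantize two series with `quantize` and compare them with `ncd`; a series compared with itself should score lower than the same series compared with unrelated noise.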
Willis Lang, Michael D. Morse, Jignesh M. Patel
Added 31 Jan 2011
Updated 31 Jan 2011
Type Journal
Year 2010
Where TKDE
Authors Willis Lang, Michael D. Morse, Jignesh M. Patel