Sciweavers

KDD
2012
ACM

The long and the short of it: summarising event sequences with serial episodes

An ideal outcome of pattern mining is a small set of informative patterns, containing no redundancy or noise, that identifies the key structure of the data at hand. Standard frequent pattern miners do not achieve this goal: because of the pattern explosion, they typically return very large numbers of highly redundant patterns. We pursue the ideal for sequential data by employing a pattern set mining approach, in which results are considered as a whole instead of ranking patterns individually. Pattern set mining has been successfully applied to transactional data, but has been surprisingly understudied for sequential data. In this paper, we employ the MDL principle to identify the set of sequential patterns that summarises the data best. In particular, we formalise how to encode sequential data using sets of serial episodes, and use the encoded length as a quality score. As search strategy, we propose two approaches: the first algorithm selects a good pattern set from a larg...
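To make the idea of using encoded length as a quality score concrete, here is a minimal Python sketch. It is not the encoding defined in the paper: it assumes patterns match contiguously (no gaps or interleaving), uses a greedy left-to-right cover, and charges a simplified model cost. The function names `cover` and `encoded_length` are hypothetical and chosen only for this illustration.

```python
import math
from collections import Counter


def cover(sequence, patterns):
    """Greedily cover the sequence left to right, preferring longer patterns.

    Assumption for this toy: a pattern only matches a contiguous run of events.
    """
    ordered = sorted((p for p in patterns if p), key=len, reverse=True)
    used = []
    i = 0
    while i < len(sequence):
        for p in ordered:
            if tuple(sequence[i:i + len(p)]) == p:
                used.append(p)
                i += len(p)
                break
        else:
            # No pattern matches here: fall back to a single-event pattern.
            used.append((sequence[i],))
            i += 1
    return used


def encoded_length(sequence, patterns):
    """Toy two-part score in bits: L(model) + L(data | model)."""
    used = Counter(cover(sequence, patterns))
    total = sum(used.values())
    # Data cost: Shannon-optimal code length for every pattern usage.
    data_bits = sum(-n * math.log2(n / total) for n in used.values())
    # Model cost (simplified): spell out each used multi-event pattern,
    # coding its events by their raw frequency in the sequence.
    freq = Counter(sequence)
    model_bits = sum(
        -math.log2(freq[e] / len(sequence))
        for p in used if len(p) > 1
        for e in p
    )
    return model_bits + data_bits
```

Under these assumptions, a lower score means a better summary. For the sequence `list("abcabcabcabc")`, for instance, the pattern set `{("a", "b", "c")}` gives a shorter encoding than using single events only, because one repeated pattern replaces twelve separately coded events.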
Nikolaj Tatti, Jilles Vreeken
Added 28 Sep 2012
Updated 28 Sep 2012
Type Journal
Year 2012
Where KDD
Authors Nikolaj Tatti, Jilles Vreeken