We present a generalization of frequent itemsets allowing the notion of errors in the itemset definition. We motivate the problem and present an efficient algorithm that identifie...
Time series data requires a significant variation on the traditional segmentation techniques of data mining because the observation is derived from multiple instances of the same und...
Historical prices are important information that can help consumers decide whether the time is right to buy a product. They both provide context to users and facilitate the...
We present a new automated white box fuzzing technique and a tool, BuzzFuzz, that implements this technique. Unlike standard fuzzing techniques, which randomly change parts of the...
This paper explores unexpected results that lie at the intersection of two common themes in the KDD community: large datasets and the goal of building compact models. Experiments ...