Sciweavers

KDD
2007
ACM

Exploiting duality in summarization with deterministic guarantees

14 years 4 months ago
Exploiting duality in summarization with deterministic guarantees
Summarization is an important task in data mining. A major challenge over the past years has been the efficient construction of fixed-space synopses that provide a deterministic quality guarantee, often expressed in terms of a maximum-error metric. Histograms and several hierarchical techniques have been proposed for this problem. However, their time and/or space complexities remain impractically high and depend not only on the data set size n, but also on the space budget B. These handicaps stem from a requirement to tabulate all allocations of synopsis space to different regions of the data. In this paper we develop an alternative methodology that dispels these deficiencies, thanks to a fruitful application of the solution to the dual problem: given a maximum allowed error, determine the minimum-space synopsis that achieves it. Compared to the state-of-the-art, our histogram construction algorithm reduces time complexity by (at least) a B log2 n log factor and our hierarchical syno...
Panagiotis Karras, Dimitris Sacharidis, Nikos Mamo
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2007
Where KDD
Authors Panagiotis Karras, Dimitris Sacharidis, Nikos Mamoulis
Comments (0)