Estimating the cardinality (i.e. number of distinct elements) of an arbitrary set expression defined over multiple distributed streams is one of the most fundamental queries of in...
Two central criteria for data quality are consistency and accuracy. Inconsistencies and errors in a database often emerge as violations of integrity constraints. Given a dirty dat...
Recent research studied the problem of publishing microdata without revealing sensitive information, leading to the privacy-preserving paradigms of k-anonymity and ℓ-diversity. k-a...
Gabriel Ghinita, Panagiotis Karras, Panos Kalnis, ...
We present STAR, a self-tuning algorithm that adaptively sets numeric precision constraints to accurately and efficiently answer continuous aggregate queries over distributed data...
Navendu Jain, Michael Dahlin, Yin Zhang, Dmitry Ki...
Existing work on similar sequence matching has focused on either whole matching or range subsequence matching. In this paper, we present novel methods for ranked subsequence match...