Summarization is an important task in data mining. A major challenge over the past years has been the efficient construction of fixed-space synopses that provide a deterministic q...
In the life sciences, genomic databases for sequence search have been growing exponentially in size. As a result, faster sequencesearch algorithms to search these databases contin...
Oystein Thorsen, Brian E. Smith, Carlos P. Sosa, K...
— One of the central problems for data quality is inconsistency detection. Given a database D and a set Σ of dependencies as data quality rules, we want to identify tuples in D ...
Histograms are typically used to approximate data distributions. Histograms and related synopsis structures have been successful in a wide variety of popular database applications...
Many parallel join algorithms have been proposed in the last several years. However, most of these algorithms require that the amount of data to be joined is known in advance in o...