Data quality is critical for many information-intensive applications. One of the best opportunities to improve data quality is during entry. USHER provides a theoretical, data-dri...
Kuang Chen, Joseph M. Hellerstein, Tapan S. Parikh
Central to a data cleaning system are record matching and data repairing. Matching aims to identify tuples that refer to the same real-world object, and repairing is to make a dat...
Wenfei Fan, Jianzhong Li, Shuai Ma, Nan Tang, Weny...
The Lagrangian formulation of variable-rate vector quantization is known to yield useful necessary conditions for quantizer optimality and generalized Lloyd algorithms fo...
Recent years have witnessed an increasing interest in designing algorithms for querying and analyzing streaming data (i.e., data that is seen only once in a fixed order) with only...
Alin Dobra, Minos N. Garofalakis, Johannes Gehrke,...
Many time series exhibit dynamics over vastly different time scales. The standard way to capture this behavior is to assume that the slow dynamics are a “trend”, to de-trend t...