We study several transparent techniques for scaling dynamic content web sites, and we evaluate their relative impact when used in combination. Full transparency implies strong dat...
The application of hardware-parameterized models to distributed systems can result in omission of key bottlenecks such as the full cost of inter-node communication in a shared mem...
Many real datasets have uncertain categorical attribute values that are only approximately measured or imputed. Uncertainty in categorical data is commonplace in many applications...
We propose efficient techniques for processing various TopK count queries on data with noisy duplicates. Our method differs from existing work on duplicate elimination in two sign...
Sunita Sarawagi, Vinay S. Deshpande, Sourabh Kasli...
We consider the problem of collectively approximating a set of sensor signals using the least amount of space so that any individual signal can be efficiently reconstructed within...