In this paper, we study the problem of learning in the presence of classification noise in the probabilistic learning model of Valiant and its variants. In order to identify the cl...
The sparse data is becoming increasingly common and available in many real-life applications. However, relative little attention has been paid to effectively model the sparse data ...
The All Nearest Neighbor (ANN) operation is a commonly used primitive for analyzing large multi-dimensional datasets. Since computing ANN is very expensive, in previous works R*-t...
Nearest Neighbor (NN) retrieval is a crucial tool of many computer vision tasks. Since the brute-force naive search is too time consuming for most applications, several tailored d...
We propose efficient techniques for processing various TopK count queries on data with noisy duplicates. Our method differs from existing work on duplicate elimination in two sign...
Sunita Sarawagi, Vinay S. Deshpande, Sourabh Kasli...