Approximating pairwise, or k-wise, independence with sublinear memory is of considerable importance in the data stream model. In the streaming model, the joint distribution is give...
We present a new family of linear time algorithms based on sufficient statistics for string comparison with mismatches under the string kernels framework. Our algorithms improve t...
The growing use of information visualization tools and data mining algorithms stems from two separate lines of research. Information visualization researchers believe in the impor...
This paper is about non-approximate acceleration of high-dimensional nonparametric operations such as k-nearest-neighbor classifiers. We attempt to exploit the fact that even if w...
We develop exact and approximate algorithms for computing optimal separators and measuring the extent to which two point sets in d-dimensional space are separated, with respect to...