Although many variants of language models have been proposed for information retrieval, there are two related retrieval heuristics remaining “external” to the language modelin...
Abstract. Speeding up approximate pattern matching has been a line of research in stringology since the 1980s. Practically fast approaches belong to the class of filtration algorithms,...
Classic algorithms for sequential pattern discovery return all frequent sequences present in a database. Since, in general, only a few are interesting from a user's poin...
Background: With the exponential increase in genomic sequence data there is a need to develop automated approaches to deducing the biological functions of novel sequences with hig...
We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. We refer to this model as the sliding wind...
Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev...