Natural language technologies have long been envisioned to play a crucial role in transitioning from the current Web to a more "semantic" Web. If anything, the significa...
Peter Mika, Massimiliano Ciaramita, Hugo Zaragoza,...
Large inverted indices are by now common in the construction of web-scale search engines. For faster access, inverted indices are indexed internally so that it is possible to skip...
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Although Bloom filters allow false positives, f...
We describe the QccPack software package, an open-source collection of library routines and utility programs for quantization, compression, and coding of data. QccPack is being wr...
Many large-scale Web applications that require ranked top-k retrieval are implemented using inverted indices. An inverted index represents a sparse term-document matrix, where non...
George Beskales, Marcus Fontoura, Maxim Gurevich, ...