Faster top-k document retrieval using block-max indexes

Large search engines process thousands of queries per second over billions of documents, making query processing a major performance bottleneck. An important class of optimization techniques called early termination achieves faster query processing by avoiding the scoring of documents that are unlikely to be in the top results. We study new algorithms for early termination that outperform previous methods. In particular, we focus on safe techniques for disjunctive queries, which return the same result as an exhaustive evaluation over the disjunction of the query terms. The current state-of-the-art methods for this case, the WAND algorithm by Broder et al. [11] and the approach of Strohman and Croft [30], achieve great benefits but still leave a large performance gap between disjunctive and (even non-early terminated) conjunctive queries. We propose a new set of algorithms by introducing a simple augmented inverted index structure called a block-max index. Essentially, this is a struc...
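The abstract is cut off before it describes the structure, so the following is only a minimal sketch of the general idea, assuming a block-max index keeps, for each fixed-size block of a posting list, the largest term score found in that block. Summing those per-block maxima across the query terms gives an upper bound on a document's score, and any document whose bound cannot beat the current k-th best score can be skipped without changing the result, which is what makes the pruning safe. The names `BlockMaxList`, `top_k`, and `BLOCK_SIZE` are hypothetical, scores are assumed non-negative, and the sketch ignores compression and list-pointer movement. It is not the authors' algorithm or the WAND algorithm, only an illustration of the pruning test.

```python
import heapq
from bisect import bisect_left

BLOCK_SIZE = 2  # hypothetical; deliberately tiny so the example is easy to trace


class BlockMaxList:
    """Posting list sorted by doc ID, plus the maximum score of each block."""

    def __init__(self, postings, block_size=BLOCK_SIZE):
        self.postings = sorted(postings)  # list of (doc_id, score) pairs
        self.doc_ids = [d for d, _ in self.postings]
        starts = range(0, len(self.postings), block_size)
        # Largest score inside each block.
        self.block_max = [
            max(s for _, s in self.postings[i:i + block_size]) for i in starts
        ]
        # Last doc ID of each block, used to locate the block for a doc ID.
        self.block_last = [
            self.postings[min(i + block_size, len(self.postings)) - 1][0]
            for i in starts
        ]

    def bound(self, doc_id):
        """Upper bound on this term's score for doc_id, from block metadata only."""
        b = bisect_left(self.block_last, doc_id)
        return self.block_max[b] if b < len(self.block_max) else 0.0

    def score(self, doc_id):
        """Exact score of doc_id for this term (0.0 if the doc is absent)."""
        i = bisect_left(self.doc_ids, doc_id)
        if i < len(self.doc_ids) and self.doc_ids[i] == doc_id:
            return self.postings[i][1]
        return 0.0


def top_k(lists, k, prune=True):
    """Disjunctive top-k; returns (results, number of fully scored documents)."""
    candidates = sorted({d for pl in lists for d in pl.doc_ids})
    heap = []  # min-heap of (score, doc_id) holding the current top k
    scored = 0
    for doc in candidates:
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        # Safe skip: the block-max bound cannot exceed the k-th best score.
        if prune and sum(pl.bound(doc) for pl in lists) <= threshold:
            continue
        scored += 1
        s = sum(pl.score(doc) for pl in lists)
        if len(heap) < k:
            heapq.heappush(heap, (s, doc))
        elif s > threshold:
            heapq.heapreplace(heap, (s, doc))
    return sorted(heap, reverse=True), scored
```

Calling `top_k(lists, k, prune=False)` gives the exhaustive evaluation, so the two modes can be compared directly: the results should match while the pruned run fully scores fewer documents.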
Shuai Ding, Torsten Suel
Added 17 Sep 2011
Updated 17 Sep 2011
Type Journal
Year 2011
Authors Shuai Ding, Torsten Suel