Durable top-k search in document archives

We propose and study a new ranking problem in versioned databases. Consider a database of versioned objects which have different valid instances along a history (e.g., documents in a web archive). Durable top-k search finds the set of objects that are consistently in the top-k results of a query (e.g., a keyword query) throughout a given time interval (e.g., from June 2008 to May 2009). Existing work on temporal top-k queries mainly focuses on finding the most representative top-k elements within a time interval. Such methods are not readily applicable to durable top-k queries. To address this need, we propose two techniques that compute the durable top-k result. The first is adapted from the classic top-k rank aggregation algorithm NRA. The second technique is based on a shared execution paradigm and is more efficient than the first approach. In addition, we propose a special indexing technique for archived data. The index, coupled with a space partitioning technique, improves perfor...
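To make the problem definition concrete, here is a minimal brute-force sketch in Python. It is not either of the two techniques the abstract proposes (the NRA adaptation or the shared execution approach), only a naive baseline that recomputes the top-k set at every snapshot and intersects the results. The data model (one score table per integer timestamp), the function names `top_k` and `durable_top_k`, and the reading of "consistently" as "at every snapshot in the interval" are all assumptions made for illustration.

```python
from typing import Dict, Set

# Hypothetical data model (not from the paper): one score table per snapshot.
# snapshots[t][obj_id] is the query score of the version of obj_id valid at time t.
Snapshots = Dict[int, Dict[str, float]]


def top_k(scores: Dict[str, float], k: int) -> Set[str]:
    """Return the ids of the k highest-scoring objects (ties broken by id)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return {obj_id for obj_id, _ in ranked[:k]}


def durable_top_k(snapshots: Snapshots, k: int, start: int, end: int) -> Set[str]:
    """Naive baseline: objects in the top-k at every snapshot in [start, end]."""
    times = [t for t in sorted(snapshots) if start <= t <= end]
    if not times:
        return set()
    durable = top_k(snapshots[times[0]], k)
    for t in times[1:]:
        durable &= top_k(snapshots[t], k)
        if not durable:  # nothing can become durable again, stop early
            break
    return durable


# Toy archive with three snapshots of three documents.
snapshots = {
    1: {"a": 0.9, "b": 0.8, "c": 0.1},
    2: {"a": 0.7, "b": 0.2, "c": 0.6},
    3: {"a": 0.8, "b": 0.9, "c": 0.3},
}
result = durable_top_k(snapshots, k=2, start=1, end=3)
print(result)
```

In this toy archive only document `a` stays in the top 2 across all three snapshots. The baseline ranks every object at every snapshot, which is the kind of repeated work that a more efficient method would aim to avoid.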
Added 21 May 2011
Updated 21 May 2011
Type Journal
Year 2010
Authors Leong Hou U, Nikos Mamoulis, Klaus Berberich, Srikanta J. Bedathur