Topical query decomposition

9 years 2 months ago
Topical query decomposition
We introduce the problem of query decomposition, where we are given a query and a document retrieval system, and we want to produce a small set of queries whose union of resulting documents corresponds approximately to that of the original query. Ideally, these queries should represent coherent, conceptually well-separated topics. We provide an abstract formulation of the query decomposition problem, and we tackle it from two different perspectives. We first show how the problem can be instantiated as a specific variant of a set cover problem, for which we provide an efficient greedy algorithm. Next, we show how the same problem can be seen as a constrained clustering problem, with a very particular kind of constraint, i.e., clustering with predefined clusters. We develop a two-phase algorithm based on hierarchical agglomerative clustering followed by dynamic programming. Our experiments, conducted on a set of actual queries in a Web scale search engine, confirm the effectiveness of t...
Francesco Bonchi, Carlos Castillo, Debora Donato,
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2008
Where KDD
Authors Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis
Comments (0)