SIGIR
2009
ACM

Building enriched document representations using aggregated anchor text

It is well known that anchor text plays a critical role in a variety of search tasks performed over hypertextual domains, including enterprise search, wiki search, and web search. It is common practice to enrich a document’s standard textual representation with all of the anchor text associated with its incoming hyperlinks. However, this approach does not help match relevant pages with very few inlinks. In this paper, we propose a method for overcoming anchor text sparsity by enriching document representations with anchor text that has been aggregated across the hyperlink graph. This aggregation mechanism acts to smooth, or diffuse, anchor text within a domain. We rigorously evaluate our proposed approach on a large web search test collection. Our results show the approach significantly improves retrieval effectiveness, especially for longer, more difficult queries.

Categories and Subject Descriptors H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval Gene...
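
The abstract does not spell out how the aggregation works, so the following is only a rough sketch of the general idea of diffusing anchor text across a link graph, not the authors' method. It assumes a hypothetical one-hop scheme: a page keeps the anchor terms of its own incoming links at full weight and also receives, at a reduced weight, the anchor terms of the pages that link to it. The function names, the `(source, target, anchor_text)` tuple format, and the `weight` parameter are all illustrative assumptions.

```python
from collections import defaultdict


def direct_anchor_text(links):
    """Map each target page to the anchor terms of its incoming links."""
    anchors = defaultdict(list)
    for _source, target, text in links:
        anchors[target].extend(text.lower().split())
    return anchors


def aggregated_anchor_text(links, weight=0.5):
    """Return weighted anchor terms per page (hypothetical one-hop diffusion).

    A page's own anchor terms count 1.0 each; anchor terms of each page
    that links to it count `weight` each. This is an assumed scheme.
    """
    direct = direct_anchor_text(links)

    # Record which pages link to each target.
    inlinks = defaultdict(set)
    for source, target, _text in links:
        inlinks[target].add(source)

    pages = {page for source, target, _ in links for page in (source, target)}
    enriched = {}
    for page in pages:
        scores = defaultdict(float)
        for term in direct.get(page, []):
            scores[term] += 1.0
        for neighbor in inlinks.get(page, ()):
            for term in direct.get(neighbor, []):
                scores[term] += weight
        enriched[page] = dict(scores)
    return enriched


# Toy graph: "faq" has a single inlink, so its own anchor text is sparse.
links = [
    ("ext1", "home", "Acme Widgets"),
    ("ext2", "home", "widgets store"),
    ("home", "faq", "FAQ"),
]
enriched = aggregated_anchor_text(links)
```

In this toy graph the sparsely linked `faq` page picks up down-weighted terms such as "widgets" from the anchor text pointing at `home`, which is the kind of smoothing the abstract describes at a high level.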
Donald Metzler, Jasmine Novak, Hang Cui, Srihari Reddy
Added 28 May 2010
Updated 28 May 2010
Type Conference
Year 2009
Where SIGIR
Authors Donald Metzler, Jasmine Novak, Hang Cui, Srihari Reddy