A Generative Blog Post Retrieval Model that Uses Query Expansion based on External Collections

User-generated content is characterized by short, noisy documents with many spelling errors and unexpected language usage. To bridge the vocabulary gap between the user's information need and documents in a specific user-generated content environment, the blogosphere, we apply a form of query expansion, i.e., adding and reweighting query terms. Since the blogosphere is noisy, query expansion on the collection itself is rarely effective, but external, edited collections are more suitable. We propose a generative model for expanding queries using external collections in which dependencies between queries, documents, and expansion documents are explicitly modeled. We discuss different instantiations of our model, which make different (in)dependence assumptions. Results using two external collections (news and Wikipedia) show that external expansion for retrieval of user-generated content is effective; besides, conditioning the external collection on the query is very beneficial, and ...
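To make the idea of expanding a query from external collections concrete, here is a minimal Python sketch. It is not the authors' model: it assumes a simple relevance-model-style mixture, P(t|Q) ≈ Σ_c P(c|Q) Σ_{D∈c} P(t|D) P(D|Q,c), with maximum-likelihood term estimates and a crude probability floor in place of proper smoothing. The names `term_dist`, `query_likelihood`, `expand_query`, and `collection_weight` are hypothetical, and the collection weights standing in for P(c|Q) are supplied by hand rather than estimated.

```python
from collections import Counter


def term_dist(text):
    """Maximum-likelihood unigram distribution P(t|D) for one document."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def query_likelihood(query_terms, dist, eps=1e-6):
    """P(Q|D), flooring unseen terms at eps (an assumed stand-in for smoothing)."""
    p = 1.0
    for t in query_terms:
        p *= dist.get(t, eps)
    return p


def expand_query(query, collections, collection_weight, k=5):
    """Return the top-k expansion terms with normalised weights.

    collections: dict mapping a collection name to a list of document strings.
    collection_weight: dict mapping a collection name to an assumed P(c|Q).
    """
    q_terms = query.lower().split()
    scores = Counter()
    for name, docs in collections.items():
        dists = [term_dist(d) for d in docs]
        likes = [query_likelihood(q_terms, d) for d in dists]
        norm = sum(likes)
        for dist, like in zip(dists, likes):
            doc_weight = like / norm  # P(D|Q, c), normalised within the collection
            for t, p in dist.items():
                scores[t] += collection_weight[name] * doc_weight * p
    total = sum(scores.values())
    return [(t, s / total) for t, s in scores.most_common(k)]


if __name__ == "__main__":
    # Toy stand-ins for the two external collections (invented example data).
    external = {
        "news": ["obama wins election vote", "football match score"],
        "wiki": ["election vote ballot process"],
    }
    print(expand_query("election", external, {"news": 0.5, "wiki": 0.5}, k=3))
```

In this sketch, documents that match the query well dominate the term distribution of their collection, and the collection weights decide how much each external source contributes to the expanded query.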
Wouter Weerkamp, Krisztian Balog, Maarten de Rijke
Added 16 Feb 2011
Updated 16 Feb 2011
Type Journal
Year 2009
Where ACL