Sciweavers

AIRS
2004
Springer

Effective Topic Distillation with Key Resource Pre-selection

Topic distillation aims to find key resources, which are high-quality pages for a given topic. Based on an analysis of the non-content features of key resources, a pre-selection method is introduced into topic distillation research. A decision tree is constructed to locate key resource pages using query-independent, non-content features: in-degree, document length, URL type, and two new features derived from an analysis of a site's self-link structure. Although the resulting page set contains only about 20% of the pages in the whole collection, it covers more than 70% of the key resources. Furthermore, information retrieval on this page set yields an improvement of more than 60% over retrieval on all pages. These results were obtained using the TREC 2002 web track topic distillation task for training and the corresponding TREC 2003 task for testing. They show an effective way of achieving better topic distillation performance with a significantly smaller dataset.
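The pre-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' decision tree. The `Page` fields, the URL-type labels (`root`, `subroot`, `path`, `file`), the thresholds, and the toy pages are all assumptions made for illustration, and the two self-link features are left out because the abstract does not define them. The sketch filters pages with a query-independent rule, then reports the fraction of the collection kept and the share of key resources that survive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    url: str
    in_degree: int        # number of in-links pointing at the page
    length: int           # document length in words
    url_type: str         # assumed labels: "root", "subroot", "path", "file"
    is_key: bool = False  # ground-truth label, used only for evaluation


def is_candidate(page, min_in_degree=10, min_length=100):
    """Hand-written stand-in for a learned decision tree.

    The thresholds are hypothetical; a real tree would be trained on
    labelled key-resource pages.
    """
    if page.url_type in ("root", "subroot"):
        return page.in_degree >= min_in_degree
    if page.url_type == "path":
        return page.in_degree >= 2 * min_in_degree and page.length >= min_length
    return False


def preselect(pages):
    """Keep only pages the query-independent rule accepts."""
    return [p for p in pages if is_candidate(p)]


def coverage(pages, selected):
    """Return (fraction of pages kept, fraction of key resources kept)."""
    total_key = sum(p.is_key for p in pages)
    kept_key = sum(p.is_key for p in selected)
    key_share = kept_key / total_key if total_key else 0.0
    return len(selected) / len(pages), key_share


# Toy collection, invented purely for illustration.
pages = [
    Page("http://a.example/", 50, 400, "root", True),
    Page("http://a.example/docs/", 25, 300, "path", True),
    Page("http://a.example/docs/x.html", 1, 900, "file", False),
    Page("http://b.example/", 3, 200, "root", False),
    Page("http://b.example/news/", 5, 50, "path", True),
]

selected = preselect(pages)
kept_fraction, key_coverage = coverage(pages, selected)
print(f"kept {kept_fraction:.0%} of pages, covering {key_coverage:.0%} of key resources")
```

Because the filter ignores the query, it can be run once over the whole collection, and retrieval can then be restricted to the smaller selected set.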
Yiqun Liu, Min Zhang, Shaoping Ma
Added 20 Aug 2010
Updated 20 Aug 2010
Type Conference
Year 2004
Where AIRS
Authors Yiqun Liu, Min Zhang, Shaoping Ma