Sciweavers

WWW
2005
ACM

Focused crawling by exploiting anchor text using decision tree

Focused crawlers are considered as a promising way to tackle the scalability problem of topic-oriented or personalized search engines. To design a focused crawler, the choice of strategy for prioritizing unvisited URLs is crucial. In this paper, we propose a method using a decision tree on anchor texts of hyperlinks. We conducted experiments on the real data sets of four Japanese universities and verified our approach.
Categories and Subject Descriptors H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval--search process; I.2.8 [Artificial Intelligence]: Problem Solving, Control Methods, and Search--graph and tree search strategies
General Terms Algorithms, Experimentation, Performance
Keywords Focused Crawling, Anchor Text, Decision Tree Learning, Shortest Path
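The abstract's core idea, ranking unvisited URLs by what a decision tree says about their anchor text, can be illustrated with a minimal sketch. It is not the authors' implementation: the listing does not describe their features, training procedure, or scoring. The tree below is hand-written rather than learned, and the terms, leaf scores, class names, and example URLs are all hypothetical.

```python
import heapq
from itertools import count

# Hypothetical, hand-written tree (an assumption, not learned from data):
# each internal node tests whether a term occurs in the anchor text,
# and each leaf holds an assumed relevance score in [0, 1].
TREE = {
    "term": "research",
    "yes": {"term": "lab", "yes": 0.9, "no": 0.6},
    "no": {"term": "login", "yes": 0.0, "no": 0.3},
}


def score_anchor(anchor_text, node=TREE):
    """Walk the tree from the root until a leaf score is reached."""
    words = set(anchor_text.lower().split())
    while isinstance(node, dict):
        node = node["yes"] if node["term"] in words else node["no"]
    return node


class Frontier:
    """Priority queue of unvisited URLs, highest anchor-text score first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._tie = count()  # preserves insertion order among equal scores

    def push(self, url, anchor_text):
        if url in self._seen:
            return  # each URL is queued at most once
        self._seen.add(url)
        # heapq is a min-heap, so the score is negated to pop the best first.
        entry = (-score_anchor(anchor_text), next(self._tie), url)
        heapq.heappush(self._heap, entry)

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    frontier = Frontier()
    # Hypothetical links extracted from a fetched page.
    frontier.push("http://example.org/login", "Login")
    frontier.push("http://example.org/lab", "Research Lab")
    frontier.push("http://example.org/news", "News")
    while frontier:
        print(frontier.pop())
```

In a real crawler the tree would be learned from labeled crawl data rather than written by hand, and the frontier would be refilled with links from each fetched page; the priority-queue structure stays the same.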
Jun Li, Kazutaka Furuse, Kazunori Yamaguchi
Added 22 Nov 2009
Updated 22 Nov 2009
Type Conference
Year 2005
Where WWW