First-order focused crawling

This paper reports a new general framework for focused web crawling based on "relational subgroup discovery". Predicates are used explicitly to represent the relevance clues of unvisited pages in the crawl frontier, and first-order classification rules are then induced using a subgroup discovery technique. The learned relational rules that have sufficient support and confidence subsequently guide the crawling process. We present the main features of the proposed first-order focused crawler, together with promising preliminary experimental results.
Categories and Subject Descriptors: H.5.4 [Information Interfaces and Presentation]: Hypertext/Hypermedia; I.2.6 [Artificial Intelligence]: Learning
General Terms: Algorithms, Performance, Measurement
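The rule-guided crawl described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual predicates, rule language, or scoring scheme, so everything below is an assumption made for illustration: the predicates `anchor_has` and `parent_relevant`, the candidate fields, the thresholds, and the choice to score a link by the highest confidence among the rules that cover it. The rules are written by hand here; the subgroup discovery step that would induce them is not shown.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A candidate is an unvisited link in the crawl frontier, described by
# hypothetical features (not taken from the paper).
Candidate = Dict[str, object]
Predicate = Callable[[Candidate], bool]


@dataclass
class Rule:
    """A conjunction of predicates with its support and confidence."""
    name: str
    body: List[Predicate]
    support: float
    confidence: float

    def covers(self, candidate: Candidate) -> bool:
        # The rule fires only if every predicate in its body holds.
        return all(pred(candidate) for pred in self.body)


def anchor_has(word: str) -> Predicate:
    """Hypothetical predicate: the link's anchor text contains `word`."""
    return lambda c: word in str(c.get("anchor_text", "")).lower()


def parent_relevant(candidate: Candidate) -> bool:
    """Hypothetical predicate: the page linking to the candidate was relevant."""
    return bool(candidate.get("parent_relevant", False))


def select_rules(rules: List[Rule], min_support: float,
                 min_confidence: float) -> List[Rule]:
    """Keep only rules that meet both thresholds."""
    return [r for r in rules
            if r.support >= min_support and r.confidence >= min_confidence]


def score(candidate: Candidate, rules: List[Rule]) -> float:
    """Assumed scoring: best confidence among covering rules, else 0.0."""
    return max((r.confidence for r in rules if r.covers(candidate)),
               default=0.0)


def order_frontier(candidates: List[Candidate],
                   rules: List[Rule]) -> List[Tuple[str, float]]:
    """Return (url, score) pairs in the order the crawler would visit them."""
    # heapq is a min-heap, so scores are negated; the index breaks ties.
    heap = [(-score(c, rules), i, str(c["url"]))
            for i, c in enumerate(candidates)]
    heapq.heapify(heap)
    ordered = []
    while heap:
        neg_score, _, url = heapq.heappop(heap)
        ordered.append((url, -neg_score))
    return ordered
```

In this sketch a rule with high confidence but very low support is dropped by `select_rules` before it can influence the visiting order, which mirrors the abstract's requirement that guiding rules have both sufficient support and sufficient confidence.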
Qingyang Xu, Wanli Zuo
Added 21 Nov 2009
Updated 21 Nov 2009
Type Conference
Year 2007
Where WWW