Sciweavers

ICML
1999
IEEE

Using Reinforcement Learning to Spider the Web Efficiently

Consider the task of exploring the Web in order to find pages of a particular kind or on a particular topic. This task arises in the construction of search engines and Web knowledge bases. This paper argues that the creation of efficient Web spiders is best framed and solved by reinforcement learning, a branch of machine learning that concerns itself with optimal sequential decision making. One strength of reinforcement learning is that it provides a formalism for measuring the utility of actions that give benefit only in the future. We present an algorithm for learning a value function that maps hyperlinks to future discounted reward using a naive Bayes text classifier. Experiments on two real-world spidering tasks show a threefold improvement in spidering efficiency over traditional breadth-first search, and up to a twofold improvement over reinforcement learning with immediate reward only.
Jason Rennie, Andrew McCallum
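The idea of scoring hyperlinks by future discounted reward can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy link graph, the discount factor of 0.5, the hand-filled `LINK_VALUES` table (standing in for the learned naive Bayes value function), and the function names `discounted_return` and `best_first_crawl` are all assumptions made for illustration.

```python
import heapq


def discounted_return(rewards, gamma=0.5):
    """Sum of gamma**t * r_t, where r_t is the reward t steps after following a link."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def best_first_crawl(graph, targets, start, link_value, budget):
    """Fetch up to `budget` pages, always following the highest-valued link first.

    graph:      dict mapping a page to the pages it links to (toy stand-in for the Web)
    targets:    set of on-topic pages
    link_value: hypothetical scorer (source, dest) -> estimated discounted reward
    """
    visited = []
    seen = {start}
    # heapq is a min-heap, so values are negated; the counter breaks ties in
    # insertion order.
    frontier = [(0.0, 0, start)]
    counter = 1
    while frontier and len(visited) < budget:
        _, _, page = heapq.heappop(frontier)
        visited.append(page)
        for dest in graph.get(page, []):
            if dest not in seen:
                seen.add(dest)
                heapq.heappush(frontier, (-link_value(page, dest), counter, dest))
                counter += 1
    found = [p for p in visited if p in targets]
    return visited, found


# Toy site (hypothetical): the on-topic pages sit three links away from "home".
GRAPH = {
    "home": ["news", "dept"],
    "news": ["story1", "story2"],
    "dept": ["people"],
    "people": ["paper1", "paper2"],
}
TARGETS = {"paper1", "paper2"}

# Hand-filled values standing in for a learned value function (assumption).
LINK_VALUES = {
    ("home", "dept"): discounted_return([0, 0, 1]),   # target reached two steps later
    ("dept", "people"): discounted_return([0, 1]),    # target reached one step later
    ("people", "paper1"): discounted_return([1]),     # link leads straight to a target
    ("people", "paper2"): discounted_return([1]),
}


def toy_link_value(source, dest):
    """Look up the illustrative value; unlisted links are worth nothing."""
    return LINK_VALUES.get((source, dest), 0.0)


visited, found = best_first_crawl(GRAPH, TARGETS, "home", toy_link_value, budget=5)
```

With a budget of five fetches, the crawl follows the "dept" link ahead of "news" even though neither page is itself on topic, because only the former carries discounted future reward. A breadth-first crawl of the same toy graph would spend the same budget on "home", "news", "dept", and the two story pages.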
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 1999
Where ICML
Authors Jason Rennie, Andrew McCallum