Sciweavers

10 search results - page 1 / 2
» Graph based crawler seed selection
WWW
2009
ACM
14 years 5 months ago
Graph based crawler seed selection
This paper identifies and explores the problem of seed selection in a web-scale crawler. We argue that seed selection is not a trivial but very important problem. Selecting proper...
Shuyi Zheng, Pavel Dmitriev, C. Lee Giles
CIKM
2009
Springer
13 years 11 months ago
Graph-based seed selection for web-scale crawlers
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...
Shuyi Zheng, Pavel Dmitriev, C. Lee Giles
AIRS
2008
Springer
13 years 10 months ago
News Page Discovery Policy for Instant Crawlers
Many news pages which are of high freshness requirement are published on the internet every day. They should be downloaded immediately by instant crawlers. Otherwise, they will bec...
Yong Wang, Yiqun Liu, Min Zhang, Shaoping Ma
VLDB
2004
ACM
13 years 9 months ago
Accurate and Efficient Crawling for Relevant Websites
Focused web crawlers have recently emerged as an alternative to the well-established web search engines. While the well-known focused crawlers retrieve relevant webpages, there ar...
Martin Ester, Hans-Peter Kriegel, Matthias Schuber...
ECAI
2008
Springer
13 years 6 months ago
Reinforcement Learning with Classifier Selection for Focused Crawling
Focused crawlers are programs that wander in the Web, using its graph structure, and gather pages that belong to a specific topic. The most critical task in Focused Crawling is the...
Ioannis Partalas, Georgios Paliouras, Ioannis P. V...