Sciweavers

LAWEB
2003
IEEE
13 years 10 months ago
Cooperation Schemes between a Web Server and a Web Search Engine
Search engines provide search results based on a large repository of pages downloaded by a web crawler from several servers. To provide best results, this repository must be kept ...
Carlos Castillo
WWW
2010
ACM
13 years 12 months ago
Not so creepy crawler: easy crawler generation with standard XML queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
WWW
2004
ACM
14 years 5 months ago
Design of a crawler with bounded bandwidth
This paper presents an algorithm to bound the bandwidth of a Web crawler. The crawler collects statistics on the transfer rate of each server to predict the expected bandwidth use...
Michelangelo Diligenti, Marco Maggini, Filippo Mar...
ICDE
2002
IEEE
14 years 6 months ago
Design and Implementation of a High-Performance Distributed Web Crawler
Broad web search engines as well as many more specialized search tools rely on web crawlers to acquire large collections of pages for indexing and analysis. Such a web crawler may...
Vladislav Shkapenyuk, Torsten Suel