Sciweavers

125 search results - page 2 / 25
» Minimizing the Network Distance in Distributed Web Crawling
SIGIR
2003
ACM
13 years 11 months ago
Apoidea: A Decentralized Peer-to-Peer Architecture for Crawling the World Wide Web
This paper describes a decentralized peer-to-peer model for building a Web crawler. Most of the current systems use a centralized client-server model, in which the crawl is done by...
Aameek Singh, Mudhakar Srivatsa, Ling Liu, Todd Mi...
CORR
2012
Springer
12 years 1 month ago
Optimal Threshold Control by the Robots of Web Search Engines with Obsolescence of Documents
A typical web search engine consists of three principal parts: crawling engine, indexing engine, and searching engine. The present work aims to optimize the performance of the cra...
Konstantin Avrachenkov, Alexander N. Dudin, Valent...
ICDE
2002
IEEE
14 years 7 months ago
Design and Implementation of a High-Performance Distributed Web Crawler
Broad web search engines as well as many more specialized search tools rely on web crawlers to acquire large collections of pages for indexing and analysis. Such a web crawler may...
Vladislav Shkapenyuk, Torsten Suel
DEXAW
2010
IEEE
13 years 7 months ago
Towards a Search System for the Web Exploiting Spatial Data of a Web Document
In this paper, we describe our work in progress in the scope of information retrieval exploiting the spatial data extracted from web documents. We discuss problems of a search for ...
Stefan Dlugolinsky, Michal Laclavik, Ladislav Hluc...
CIKM
2005
Springer
13 years 11 months ago
Focused crawling for both topical relevance and quality of medical information
Subject-specific search facilities on health sites are usually built using manual inclusion and exclusion rules. These can be expensive to maintain and often provide incomplete c...
Thanh Tin Tang, David Hawking, Nick Craswell, Kath...