Sciweavers

WWW
2007
ACM
14 years 5 months ago
Random web crawls
This paper proposes a random Web crawl model. A Web crawl is a (biased and partial) image of the Web. This paper deals with the hyperlink structure, i.e. a Web crawl is a graph, w...
Toufik Bennouas, Fabien de Montgolfier
IC
2004
13 years 5 months ago
IPMicra: An IP-address based Location Aware Distributed Web Crawler
Distributed crawling is able to overcome important limitations of the traditional single-sourced web crawling systems. However, the optimal benefit of distributed crawling is usual...
Odysseas Papapetrou, George Samaras
WIDM
2004
ACM
13 years 9 months ago
Probabilistic models for focused web crawling
A focused crawler must use information gleaned from previously crawled page sequences to estimate the relevance of a newly seen URL. Therefore, good performance depends on powerfu...
Hongyu Liu, Evangelos E. Milios, Jeannette Janssen
ICDE
2005
IEEE
13 years 10 months ago
Simulation Study of Language Specific Web Crawling
The Web has been recognized as an important part of our cultural heritage. Many nations started archiving national web spaces for future generations. A key technology for data acqu...
Kulwadee Somboonviwat, Masaru Kitsuregawa, Takayuk...
ICIW
2009
IEEE
13 years 2 months ago
Utilizing RSS Feeds for Crawling the Web
We present the "advaRSS" crawling mechanism, which was created to support peRSSonal, a mechanism used to create personalized RSS feeds. In contrast to the common crawl...
George Adam, Christos Bouras, Vassilis Poulopoulos