Sciweavers

6 search results - page 1 / 2
Search: Evaluation of crawling policies for a web-repository crawler
HT
2006
ACM
Evaluation of crawling policies for a web-repository crawler
We have developed a web-repository crawler that is used for reconstructing websites when backups are unavailable. Our crawler retrieves web resources from the Internet Archive, Go...
Frank McCown, Michael L. Nelson
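The entry above describes a crawler that reconstructs lost websites by pulling resources out of web repositories such as the Internet Archive. As a rough illustration of that idea only (not the authors' implementation), the sketch below queries the Internet Archive's Wayback Machine availability API for the closest archived snapshot of a URL; the target URL and timestamp are placeholder assumptions.

```python
# Hedged sketch: look up an archived copy of a URL via the Wayback Machine
# availability API. Illustrates querying a web repository in general; it is
# not the web-repository crawler described in the paper.
import json
import urllib.parse
import urllib.request

def closest_snapshot(url, timestamp="2006"):
    """Return the URL of the closest archived snapshot, or None if none exists."""
    query = urllib.parse.urlencode({"url": url, "timestamp": timestamp})
    with urllib.request.urlopen(f"https://archive.org/wayback/available?{query}") as resp:
        data = json.load(resp)
    closest = data.get("archived_snapshots", {}).get("closest")
    return closest["url"] if closest and closest.get("available") else None

if __name__ == "__main__":
    # Placeholder URL for demonstration; a real reconstruction job would
    # iterate over every missing resource of the lost site.
    print(closest_snapshot("http://example.com/"))
```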
WIDM
2006
ACM
Lazy preservation: reconstructing websites by crawling the crawlers
Backup of websites is often not considered until after a catastrophic event has occurred to either the website or its webmaster. We introduce “lazy preservation” – digital p...
Frank McCown, Joan A. Smith, Michael L. Nelson
CIKM
2011
ACM
Focusing on novelty: a crawling strategy to build diverse language models
Word prediction performed by language models has an important role in many tasks such as word sense disambiguation, speech recognition, handwriting recognition, query spelling an...
Luciano Barbosa, Srinivas Bangalore
WWW
2008
ACM
Low-load server crawler: design and evaluation
This paper proposes a method of crawling Web servers connected to the Internet without imposing a high processing load. We are using the crawler for a field survey of the digital ...
Katsuko T. Nakahira, Tetsuya Hoshino, Yoshiki Mika...
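The abstract above concerns crawling web servers without imposing a high processing load. A common way to achieve that is a per-host politeness delay; the minimal sketch below shows that generic technique under an assumed 5-second interval, and is not the specific low-load method proposed in the paper.

```python
# Hedged sketch: a "polite" fetch loop that rate-limits requests per host so
# the crawler does not place a heavy load on any single server. Generic
# illustration only; the delay value is an assumption.
import time
import urllib.parse
import urllib.request
from collections import defaultdict

MIN_DELAY = 5.0  # assumed minimum seconds between requests to the same host
_last_hit = defaultdict(lambda: float("-inf"))  # host -> time of last request

def polite_fetch(url):
    host = urllib.parse.urlparse(url).netloc
    wait = MIN_DELAY - (time.monotonic() - _last_hit[host])
    if wait > 0:
        time.sleep(wait)  # back off until the politeness interval has passed
    _last_hit[host] = time.monotonic()
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()

if __name__ == "__main__":
    for u in ["http://example.com/", "http://example.com/about"]:
        print(u, len(polite_fetch(u)), "bytes")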
CORR
2012
arXiv
Optimal Threshold Control by the Robots of Web Search Engines with Obsolescence of Documents
A typical web search engine consists of three principal parts: a crawling engine, an indexing engine, and a searching engine. The present work aims to optimize the performance of the cra...
Konstantin Avrachenkov, Alexander N. Dudin, Valent...
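The title above refers to threshold control of re-crawling in the presence of document obsolescence. As a loose illustration of the general idea only, the sketch below schedules a page for re-crawling once the expected number of unseen changes since the last visit exceeds a threshold, under an assumed Poisson change model; the rate, threshold, and helper names are placeholders, not the queueing model analysed in the paper.

```python
# Hedged sketch: threshold-based re-crawl decision under an assumed Poisson
# change model. A page changing at rate `change_rate` is due for re-crawling
# when change_rate * (now - last_crawl), the expected number of accumulated
# changes, reaches THETA. Illustrative only.
import time

THETA = 3.0  # assumed threshold on expected accumulated changes

def due_for_recrawl(change_rate, last_crawl, now=None):
    now = time.time() if now is None else now
    expected_changes = change_rate * (now - last_crawl)  # mean of the Poisson count
    return expected_changes >= THETA

if __name__ == "__main__":
    # Example: a page changing ~0.5 times per day, last crawled 10 days ago.
    DAY = 86400.0
    print(due_for_recrawl(0.5 / DAY, last_crawl=time.time() - 10 * DAY))  # True
```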