Sciweavers

611 search results - page 3 / 123
» Random web crawls
WWW
2001
ACM
14 years 5 months ago
Intelligent crawling on the World Wide Web with arbitrary predicates
The enormous growth of the world wide web in recent years has made it important to perform resource discovery efficiently. Consequently, several new ideas have been proposed in rece...
Charu C. Aggarwal, Fatima Al-Garawi, Philip S. Yu
VLDB
2000
ACM
13 years 8 months ago
Focused Crawling Using Context Graphs
Maintaining currency of search engine indices by exhaustive crawling is rapidly becoming impossible due to the increasing size and dynamic content of the web. Focused crawlers aim...
Michelangelo Diligenti, Frans Coetzee, Steve Lawre...
STACS
2009
Springer
13 years 11 months ago
A Comparison of Techniques for Sampling Web Pages
As the World Wide Web is growing rapidly, it is getting increasingly challenging to gather representative information about it. Instead of crawling the web exhaustively one has to...
Eda Baykan, Monika Rauch Henzinger, Stefan F. Kell...
IC
2009
13 years 2 months ago
Language Based Crawling: Crawling the Arabic Content of the Web
Crawling web pages written in Arabic or any other language with limited content in the web may, at first, seem to parallel the process of crawling the English content. However, t...
Saad H. Alabbad, Sultan Alanazi
ICDM
2008
IEEE
13 years 11 months ago
xCrawl: A High-Recall Crawling Method for Web Mining
Web Mining Systems exploit the redundancy of data published on the Web to automatically extract information from existing web documents. The first step in the Information Extract...
Kostyantyn M. Shchekotykhin, Dietmar Jannach, Gerh...