Sciweavers

» Crawling on web graphs
SIGIR
2008
ACM
13 years 5 months ago
Compressed collections for simulated crawling
Collections are a fundamental tool for reproducible evaluation of information retrieval techniques. We describe a new method for distributing the document lengths and term counts ...
Alessio Orlandi, Sebastiano Vigna
ICDE
2006
IEEE
14 years 6 months ago
Query Selection Techniques for Efficient Crawling of Structured Web Sources
High-quality structured data from structured Web sources is invaluable for many applications. Hidden Web databases are not directly crawlable by Web search engines and are on...
Ping Wu, Ji-Rong Wen, Huan Liu, Wei-Ying Ma
WIDM
2004
ACM
13 years 10 months ago
Probabilistic models for focused web crawling
A focused crawler must use information gleaned from previously crawled page sequences to estimate the relevance of a newly seen URL. Therefore, good performance depends on powerfu...
Hongyu Liu, Evangelos E. Milios, Jeannette Janssen
ICDE
2006
IEEE
13 years 11 months ago
Finding Thai Web Pages in Foreign Web Spaces
While the Web has been increasingly recognized as a culturally valuable social artifact, many nations endeavor to create national Web archives for long term preservation. However, ...
Kulwadee Somboonviwat, Takayuki Tamura, Masaru Kit...
DASFAA
2007
IEEE
13 years 11 months ago
Graph Structure of the Korea Web
The study of the Web graph not only yields valuable insight into Web algorithms for crawling, searching and community discovery, and the sociological phenomena that characterize it...
In Kyu Han, Sang Ho Lee, Soowon Lee