Sciweavers

1109 search results - page 27 / 222
» Crawling on web graphs
NSDI
2010
15 years 3 months ago
The Architecture and Implementation of an Extensible Web Crawler
Many Web services operate their own Web crawlers to discover data of interest, despite the fact that large-scale, timely crawling is complex, operationally intensive, and expensive...
Jonathan M. Hsieh, Steven D. Gribble, Henry M. Lev...
WWW
2009
ACM
16 years 2 months ago
Data quality in web archiving
Web archives preserve the history of Web sites and have high long-term value for media and business analysts. Such archives are maintained by periodically re-crawling entire Web s...
Marc Spaniol, Dimitar Denev, Arturas Mazeika, Gerh...
ESWS
2008
Springer
15 years 3 months ago
Semantic Sitemaps: Efficient and Flexible Access to Datasets on the Semantic Web
Increasing amounts of RDF data are available on the Web for consumption by Semantic Web browsers and indexing by Semantic Web search engines. Current Semantic Web publishing practi...
Richard Cyganiak, Holger Stenzhorn, Renaud Delbru,...
DEBU
2002
15 years 1 months ago
The Role of Web Services in Information Search
State-of-the-art Web search engines are inherently limited in their ability to search for information in the Deep Web beyond portals. This paper discusses how Web services and Semantic-...
Jens Graupmann, Gerhard Weikum
WWW
2007
ACM
16 years 2 months ago
GigaHash: scalable minimal perfect hashing for billions of urls
A minimal perfect function maps a static set of n keys onto the range of integers {0, 1, 2, ..., n - 1}. We present a scalable high-performance algorithm based on random graphs for ...
Kumar Chellapilla, Anton Mityagin, Denis Xavier Ch...