Sciweavers

326 search results - page 17 / 66
» Optimal crawling strategies for web search engines
ASWC
2006
Springer
15 years 1 month ago
Next Generation Semantic Web Applications
Watson is a gateway to the Semantic Web: it collects, analyzes and gives access to ontologies and semantic data available online. Its objective is to support the development of ne...
Enrico Motta, Marta Sabou
WWW
2007
ACM
15 years 10 months ago
Search engine retrieval of changing information
In this paper we analyze the Web coverage of three search engines, Google, Yahoo and MSN. We conducted a 15 month study collecting 15,770 Web content or information pages linked f...
Yang Sok Kim, Byeong Ho Kang, Paul Compton, Hirosh...
SPIRE
1999
Springer
15 years 1 month ago
CoBWeb - A Crawler for the Brazilian Web
One of the key components of current Web search engines is the document collector. This paper describes CoBWeb, an automatic document collector, whose architecture is distributed ...
Altigran Soares da Silva, Eveline A. Veloso, Paulo...
WWW
2007
ACM
15 years 10 months ago
A large-scale study of robots.txt
Search engines largely rely on Web robots to collect information from the Web. Due to the unregulated open-access nature of the Web, robot activities are extremely diverse. Such c...
Yang Sun, Ziming Zhuang, C. Lee Giles
ICDE
2002
IEEE
15 years 10 months ago
Design and Implementation of a High-Performance Distributed Web Crawler
Broad web search engines as well as many more specialized search tools rely on web crawlers to acquire large collections of pages for indexing and analysis. Such a web crawler may...
Vladislav Shkapenyuk, Torsten Suel