Sciweavers

101 search results - page 11 / 21
» First-order focused crawling
WEBDB 2005, Springer
15 years 3 months ago
Searching for Hidden-Web Databases
Recently, there has been increased interest in the retrieval and integration of hidden Web data with a view to leverage high-quality information available in online databases. Alt...
Luciano Barbosa, Juliana Freire
WWW 2009, ACM
15 years 11 months ago
User-centric content freshness metrics for search engines
In order to return relevant search results, a search engine must keep its local repository synchronized to the Web, but it is usually impossible to attain perfect freshness. Hence...
Ali Dasdan, Xinh Huynh
WWW 2007, ACM
15 years 11 months ago
A large-scale study of robots.txt
Search engines largely rely on Web robots to collect information from the Web. Due to the unregulated open-access nature of the Web, robot activities are extremely diverse. Such c...
Yang Sun, Ziming Zhuang, C. Lee Giles
WWW 2004, ACM
15 years 11 months ago
Outlink estimation for pagerank computation under missing data
The enormity and rapid growth of the web-graph forces quantities such as its pagerank to be computed under missing information consisting of outlinks of pages that have not yet be...
Sreangsu Acharyya, Joydeep Ghosh
WEBDB 2005, Springer
15 years 3 months ago
Design and Implementation of a Geographic Search Engine
In this paper, we describe the design and initial implementation of a geographic search engine prototype for Germany, based on a large crawl of the de domain. Geographic search en...
Alexander Markowetz, Yen-Yu Chen, Torsten Suel, Xi...