Sciweavers

1109 search results - page 22 / 222
» Crawling on web graphs
Sort
View
CIKM
2005
Springer
15 years 7 months ago
Focused crawling for both topical relevance and quality of medical information
Subject-specific search facilities on health sites are usually built using manual inclusion and exclusion rules. These can be expensive to maintain and often provide incomplete c...
Thanh Tin Tang, David Hawking, Nick Craswell, Kath...
80
Voted
CIKM
2009
Springer
15 years 8 months ago
Graph-based seed selection for web-scale crawlers
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...
Shuyi Zheng, Pavel Dmitriev, C. Lee Giles
WEBI
2007
Springer
15 years 8 months ago
Question Answering over Implicitly Structured Web Content
Implicitly structured content on the Web such as HTML tables and lists can be extremely valuable for web search, question answering, and information retrieval, as the implicit str...
Eugene Agichtein, Chris Burges, Eric Brill
99
Voted
GCC
2005
Springer
15 years 7 months ago
Parallel Web Spiders for Cooperative Information Gathering
Web spider is a widely used approach to obtain information for search engines. As the size of the Web grows, it becomes a natural choice to parallelize the spider’s crawling proc...
Jiewen Luo, Zhongzhi Shi, Maoguang Wang, Wei Wang
ICML
2007
IEEE
16 years 2 months ago
Focused crawling with scalable ordinal regression solvers
In this paper we propose a novel, scalable, clustering based Ordinal Regression formulation, which is an instance of a Second Order Cone Program (SOCP) with one Second Order Cone ...
Rashmin Babaria, J. Saketha Nath, S. Krishnan, K. ...