Sciweavers

CIKM
2009
Springer
Graph-based seed selection for web-scale crawlers
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...
Shuyi Zheng, Pavel Dmitriev, C. Lee Giles
CIKM
2009
Springer
Exploiting bidirectional links: making spamming detection easier
Previous anti-spamming algorithms based on link structure suffer from either the weakness of the page value metric or the vagueness of the seed selection. In this paper, we propos...
Yan Zhang, Qiancheng Jiang, Lei Zhang, Yizhen Zhu