Sciweavers

» Improvement of HITS-based algorithms on web documents
VLDB
2000
ACM
15 years 1 month ago
Focused Crawling Using Context Graphs
Maintaining currency of search engine indices by exhaustive crawling is rapidly becoming impossible due to the increasing size and dynamic content of the web. Focused crawlers aim...
Michelangelo Diligenti, Frans Coetzee, Steve Lawre...
WWW
2004
ACM
15 years 10 months ago
Using URLs and table layout for web classification tasks
We propose new features and algorithms for automating Web-page classification tasks such as content recommendation and ad blocking. We show that the automated classification of We...
L. K. Shih, David R. Karger
WEBI
2009
Springer
15 years 4 months ago
Full-Subtopic Retrieval with Keyphrase-Based Search Results Clustering
We consider the problem of retrieving multiple documents relevant to the single subtopics of a given web query, termed “full-subtopic retrieval”. To solve this problem we pres...
Andrea Bernardini, Claudio Carpineto, Massimiliano...
TREC
2003
14 years 10 months ago
Relevance Propagation for Topic Distillation: UIUC TREC 2003 Web Track Experiments
In this paper, we report our experiments on the Web Track TREC-2003. We submitted five runs for the topic distillation task. Our goal was to evaluate the standard language modeli...
Azadeh Shakery, ChengXiang Zhai
SAC
2006
ACM
14 years 9 months ago
Undue influence: eliminating the impact of link plagiarism on web search rankings
Link farm spam and replicated pages can greatly deteriorate link-based ranking algorithms like HITS. In order to identify and neutralize link farm spam and replicated pages, we lo...
Baoning Wu, Brian D. Davison