Sciweavers

WWW
2010
ACM
14 years 25 days ago
Large-scale bot detection for search engines
In this paper, we propose a semi-supervised learning approach for classifying program (bot) generated web search traffic from that of genuine human users. The work is motivated by...
Hongwen Kang, Kuansan Wang, David Soukal, Fritz Be...
CPM
2000
Springer
Combinatorics
13 years 10 months ago
Identifying and Filtering Near-Duplicate Documents
Abstract. The mathematical concept of document resemblance captures well the informal notion of syntactic similarity. The resemblance can be estimated using a fixed size “sketch...
Andrei Z. Broder
PAKDD
2009
ACM
Data Mining
14 years 3 months ago
Detecting Link Hijacking by Web Spammers.
Abstract. Since current search engines employ link-based ranking algorithms as an important tool to decide a ranking of sites, Web spammers are making a significant effort to man...
Masaru Kitsuregawa, Masashi Toyoda, Young-joo Chun...
ICDE
2009
IEEE
Database
14 years 7 months ago
Contextual Ranking of Keywords Using Click Data
The problem of automatically extracting the most interesting and relevant keyword phrases in a document has been studied extensively as it is crucial for a number of applications. ...
Utku Irmak, Vadim von Brzeski, Reiner Kraft