Sciweavers

Web spam identification through content and hyperlinks
CHI
1993
ACM
13 years 10 months ago
Hyperspeech
HTTP provides a mechanism to connect web sites. Almost all sites have a large amount of hypertext content that provides connection to other sites in the World Wide Web. The succes...
Barry Arons
WWW
2006
ACM
14 years 6 months ago
An audio/video analysis mechanism for web indexing
The high availability of video streams is making necessary mechanisms for indexing such contents in the Web world. In this paper we focus on news programs and we propose a mechani...
Marco Furini, Marco Aragone
VLDB
2005
ACM
13 years 11 months ago
Discovering Large Dense Subgraphs in Massive Graphs
We present a new algorithm for finding large, dense subgraphs in massive graphs. Our algorithm is based on a recursive application of fingerprinting via shingles, and is extreme...
David Gibson, Ravi Kumar, Andrew Tomkins
SIGIR
2010
ACM
13 years 10 months ago
Adaptive near-duplicate detection via similarity learning
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
INTR
2002
13 years 5 months ago
Methodologies for crawler based Web surveys
There have been many attempts to study the content of the web, either through human or automatic agents. Five different previously used web survey methodologies are described and ...
Mike Thelwall