Sciweavers

59 search results - page 2 / 12
» Detecting phrase-level duplication on the world wide web
WWW
2010
ACM
14 years 6 days ago
Topic initiator detection on the world wide web
Xin Jin, Scott Spangler, Rui Ma, Jiawei Han
WEBNET
1996
13 years 6 months ago
Information fusion with ProFusion
The explosive growth of the World Wide Web, and the resulting information overload, has led to a mini-explosion in World Wide Web search engines. This mini-explosion, in turn, led...
Susan Gauch, Guijun Wang
LAWEB
2003
IEEE
13 years 10 months ago
On the Evolution of Clusters of Near-Duplicate Web Pages
This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...
Dennis Fetterly, Mark Manasse, Marc Najork
HUMAN
2005
Springer
13 years 10 months ago
How to Evaluate the Effectiveness of URL Normalizations
Syntactically different URLs could represent the same web page on the World Wide Web, and duplicate representation for web pages causes web applications to handle a large amount of...
Sang Ho Lee, Sung Jin Kim, Hyo Sook Jeong
DASFAA
2007
IEEE
13 years 11 months ago
Using Redundant Bit Vectors for Near-Duplicate Image Detection
Images are amongst the most widely proliferated forms of digital information due to affordable imaging technologies and the Web. In such an environment, the use of digital watermar...
Jun Jie Foo, Ranjan Sinha