Sciweavers

WWW
2003
ACM
14 years 5 months ago
Detecting Near-replicas on the Web by Content and Hyperlink Analysis
The presence of replicas or near-replicas of documents is very common on the Web. Documents may be replicated completely or partially for different reasons (versions, mirrors, etc...
Ernesto Di Iorio, Michelangelo Diligenti, Marco Go...