P2P
2010
IEEE

Optimizing Near Duplicate Detection for P2P Networks

In this paper, we propose a probabilistic algorithm for detecting near duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To this end, we present a thorough cost and probabilistic analysis that allows the algorithm to adapt to network and data collection characteristics for minimizing network cost. In addition, we extend the algorithm so that it can identify similar videos, even if some of the videos are split into different files. A thorough theoretical analysis as well as a large-scale experimental evaluation on networks of up to 100,000 peers using real-world datasets of more than 200 Gbytes demonstrate the viability of our approach.
Added 29 Jan 2011
Updated 29 Jan 2011
Type Journal
Year 2010
Where P2P
Authors Odysseas Papapetrou, Sukriti Ramesh, Stefan Siersdorfer, Wolfgang Nejdl