
CIVR
2007
Springer

Detection of near-duplicate images for web search

Among the vast numbers of images on the web are many duplicates and near-duplicates, that is, variants derived from the same original image. Such near-duplicates appear in many web image searches and may represent infringements of copyright or indicate the presence of redundancy. While methods for identifying near-duplicates have been investigated, there has been no analysis of the kinds of alterations that are common on the web or evaluation of whether real cases of near-duplication can in fact be identified. In this paper we use popular queries and a commercial image search service to collect images that we then manually analyse for instances of near-duplication. We show that such duplication is indeed significant, but that not all kinds of image alteration explored in previous literature are evident in web data. Removal of near-duplicates from a collection is impractical, but we propose that they be removed from sets of answers. We evaluate our technique for automatic identifica...
Added 07 Jun 2010
Updated 07 Jun 2010
Type Conference
Year 2007
Where CIVR
Authors Jun Jie Foo, Justin Zobel, Ranjan Sinha, Seyed M. M. Tahaghoghi