—In this paper we present a scalable and distributed system for image retrieval based on visual features and annotated text. This system is the core of the SAPIR project. Its arc...
Abstract. Since current search engines employ link-based ranking algorithms as an important tool to decide a ranking of sites, Web spammers are making a significant effort to man...
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
Popular content in video sharing web sites (e.g., YouTube) is usually duplicated. Most scholars define near-duplicate video clips (NDVC) based on non-semantic features (e.g., di...
Mauro Cherubini, Rodrigo de Oliveira, Nuria Oliver
Although content-based image retrieval has been studied for decades, most commercial image search engines are still text-based. However, there is a growing demand for techniques to...