Combating Web spam has become one of the top challenges for Web search engines. State-of-the-art spam detection techniques are usually designed for specific known types of Web spa...
Yiqun Liu, Rongwei Cen, Min Zhang, Shaoping Ma, Li...
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. Few users wish to retri...
The retrieval of similar documents from large-scale datasets has been one of the main concerns in knowledge management environments, such as plagiarism detection, news impact a...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
A fully operational large-scale digital library is likely to be based on a distributed architecture, and because of this it is likely that a number of independent search engines m...
The massive number of near-duplicate and duplicate web videos has presented both a challenge and an opportunity to multimedia computing. On one hand, browsing videos on the Internet become...