The Google search engine uses a method called PageRank, together with term-based and other ranking techniques, to order search results returned to the user. PageRank uses link ana...
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
Many ranking models have been proposed in information retrieval, and recently machine learning techniques have also been applied to ranking model construction. Most of the existin...
Xiubo Geng, Tie-Yan Liu, Tao Qin, Andrew Arnold, H...
Efficient similarity search in high-dimensional spaces is important to content-based retrieval systems. Recent studies have shown that sketches can effectively approximate L1 dist...
We consider a network of autonomous peers forming a logically global but physically distributed search engine, where every peer has its own local collection generated by independe...
Josiane Xavier Parreira, Sebastian Michel, Gerhard...