Management and retrieval of large volumes of text can be expensive in both space and time. Moreover, the range of document sizes in a large collection such as trec presents difficu...
Alistair Moffat, Ron Sacks-Davis, Ross Wilkinson, ...
We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...
The web hasgreatly improved accessto scientific literature. However, scientific articles on the web are largely disorganized, with research articles being spreadacrossarchive site...
Electronic mail poses a number of unusual challenges for the design of information retrieval systems and test collections, including informal expression, conversational structure,...
Abstract This paper describes the University of Sheffield entry for the 3rd International Competition on Plagiarism Detection which attempted the monolingual external plagiarism d...
Rao Muhammad Adeel Nawab, Mark Stevenson, Paul D. ...