Vocabulary incompatibilities arise when the terms used to index a document collection are largely unknown, or at least not well known, to the users who eventually search the collec...
James C. French, Allison L. Powell, Fredric C. Gey...
Delta compression techniques are commonly used to succinctly represent an updated version of a file with respect to an earlier one. In this paper, we study the use of delta compr...
Zan Ouyang, Nasir D. Memon, Torsten Suel, Dimitre ...
We study the problem of maintaining large replicated collections of files or documents in a distributed environment with limited bandwidth. This problem arises in a number of impo...
Current web search engines focus on searching only the most recent snapshot of the web. In some cases, however, it would be desirable to search over collections that include many ...
The aim of latent semantic indexing (LSI) is to uncover the relationships between terms, hidden concepts, and documents. LSI uses the matrix factorization technique known as singu...