Ranking a set of retrieved documents according to their relevance to a given query has become a popular problem at the intersection of web search, machine learning, and informatio...
Web browser history detection using CSS visited styles has long been dismissed as an issue of marginal impact. However, due to recent changes in Web usage patterns, coupled with br...
We have developed a web-repository crawler that is used for reconstructing websites when backups are unavailable. Our crawler retrieves web resources from the Internet Archive, Go...
This paper develops a general, formal framework for modeling term dependencies via Markov random fields. The model allows for arbitrary text features to be incorporated as eviden...
Delta compression techniques are commonly used to succinctly represent an updated version of a file with respect to an earlier one. In this paper, we study the use of delta compr...
Zan Ouyang, Nasir D. Memon, Torsten Suel, Dimitre ...