In this paper we study in what order a crawler should visit the URLs it has seen, in order to obtain more "important" pages first. Obtaining important pages rapidly can ...
An operating system’s readahead and buffer-cache behaviors can significantly impact application performance; most often these better performance, but occasionally they worsen it...
Abstract. In this paper, we examine a number of newly applied methods for combining pre-retrieval query performance predictors in order to obtain a better prediction of the query...
Several efforts have been made over the years for developing file systems in user space. Many of these efforts have failed to make a significant impact as measured by their us...
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...