We investigate three issues in distributed information retrieval, considering both TREC data and U.S. Patents: (1) topical organization of large text collections, (2) collection r...
Leah S. Larkey, Margaret E. Connell, James P. Call...
The detection of new information in a document stream is an important component of many potential applications. In this work, a new novelty detection approach based on the identif...
This work addresses two common problems in search, frequently occurring with underspecified user queries: the top-ranked results for such queries may not contain documents relevan...
In this paper, we study the problem of learning block classification models to estimate block functions. We distinguish general models, which are learned across multiple sites, an...
Although PageRank was designed to estimate the popularity of Web pages, it is a general algorithm that can be applied to the analysis of graphs other than one of hypert...