Both structured and unstructured data, as well as structured data representing several different types of tuples, may be integrated into a single list for browsing or retrieval. D...
We present a static index pruning method, to be used in ad-hoc document retrieval tasks, that follows a document-centric approach to decide whether a posting for a given term shoul...
abstraction for modeling these problems is to view the Web as a collection of (usually small and heterogeneous) databases, and to view programs that extract and process Web data au...
This paper proposes a demo of the TopX search engine, an extensive framework for unified indexing, querying, and ranking of large collections of unstructured, semistructured, and ...
A text filtering system monitors a stream of incoming documents to identify those that match the interest profiles of its users. The user interests are registered at ...