The exponential growth of data demands scalable infrastructures capable of indexing and searching rich content such as text, music, and images. A promising direction is to combine...
Current e-Book browsers provide minimal support for comprehending the organization, narrative structure, and themes of large, complex books. In order to build an understanding of ...
Document-centric XML collections contain text-rich documents, marked up with XML tags. The tags add lightweight semantics to the text. Querying such collections calls for a hybrid...
Web-page classification is much more difficult than pure-text classification due to a large variety of noisy information embedded in Web pages. In this paper, we propose a new Web...
Although information retrieval research has always been concerned with improving the effectiveness of search, in some applications, such as information analysis, a more specific ...
In this paper, we describe research that could lead to a novel approach to gathering an overview of a document in a foreign language. The research explores how much of the meanin...
Forming test collection relevance judgments from the pooled output of multiple retrieval systems has become the standard process for creating resources such as the TREC, CLEF, and...