In a federated digital library system, it is too expensive to query every accessible library. Resource selection is the task to decide to which libraries a query should be routed....
This paper presents a novel domain-independent text segmentation method, which identifies the boundaries of topic changes in long text documents and/or text streams. The method c...
Translation of proper names is generally recognized as a significant problem in many multi-lingual text and speech processing applications. Even when large bilingual lexicons use...
Libraries have traditionally used manual image annotation for indexing and then later retrieving their image collections. However, manual image annotation is an expensive and labo...
The ability to find tables and extract information from them is a necessary component of data mining, question answering, and other information retrieval tasks. Documents often c...
David Pinto, Andrew McCallum, Xing Wei, W. Bruce C...