The number of patent documents is currently rising rapidly worldwide, creating the need for an automatic categorization system to replace time-consuming and labor-intensive manual...
Using relevance feedback can significantly improve (ad hoc) retrieval effectiveness. Yet, if little feedback is available, effectively exploiting it is a challenge. To that end,...
: We propose a method for text retrieval from document images without the use of OCR. Documents are segmented into character objects. Image features, namely the Vertical Traverse D...
Translingual information retrieval (TLIR) consists of providing a query in one language and searching document collections in one or more di erent languages. This paper introduces...
Yiming Yang, Jaime G. Carbonell, Ralf D. Brown, Ro...
The ability to find tables and extract information from them is a necessary component of data mining, question answering, and other information retrieval tasks. Documents often c...
David Pinto, Andrew McCallum, Xing Wei, W. Bruce C...