Traditional interactive information retrieval systems function by creating inverted lists, or term indexes. For every term in the vocabulary, a list is created that contains the d...
Mass digitization of document collections with further processing and semantic annotation is an increasing activity among libraries and archives at large for preservation, browsi...
This paper presents a new document image binarization technique that segments the text from badly degraded historical document images. The proposed technique makes use of the imag...
Exchanging structured business documents is inevitable for successful collaboration in electronic commerce. A prerequisite for fostering the interoperability between bus...
Christian Pichler, Michael Strommer, Christian Hue...
In this paper we propose the multirelational topic model (MRTM) for multiple types of link modeling such as citation and coauthor links in document networks. In the citation networ...
Jia Zeng, William K. Cheung, Chun-hung Li, Jiming ...