Abstract. This paper describes the challenges facing the document image analysis community in building large digital libraries with diverse document categories. The challenges are identified...
K. Pramod Sankar, Vamshi Ambati, Lakshmi Pratha, C...
Keyphrases provide semantic metadata that summarize and characterize documents. This paper describes Kea, an algorithm for automatically extracting keyphrases from text. Kea ident...
Ian H. Witten, Gordon W. Paynter, Eibe Frank, Carl...
For many readers, handling a physical book is an enjoyably exquisite part of the information seeking process. Many physical characteristics of a book—its size, heft, the patina...
Yi-Chun Chu, David Bainbridge, Matt Jones, Ian H. ...
We present here a method for automatically projecting structural information across translations, including canonical citation structure (such as chapters and sections), speaker i...
We describe here a method for automatically identifying word sense variation in a dated collection of historical books in a large digital library. By leveraging a small set of kno...