A new handwritten text database, GERMANA, is presented to facilitate empirical comparison of different approaches to text line extraction and off-line handwriting recognition. G...
This paper presents a multi-domain information extraction system. The overall architecture of the system is detailed. A set of machine learning tools helps the expert to explore t...
Previous content extraction evaluations have neglected to address problems which complicate the incorporation of extracted information into an existing knowledge base. Previous qu...
Paul McNamee, Hoa Trang Dang, Heather Simpson, Pat...
We propose a new text mining system which extracts characteristic contents from given documents. We define Key semantics as characteristic sub-structures of syntactic dependencie...
Background: The exploitation of information extraction (IE), a technology aiming to provide instances of structured representations from free-form text, has been rapidly growing w...
Tuangthong Wattarujeekrit, Parantu K. Shah, Nigel ...