We present a Question-Answering (QA) system for Portuguese juridical documents. The QA system was applied to the complete set of decisions from several Portuguese juridical instit...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
We present methods for eliminating or reducing the distortion in a scanned image. Aspects of the present paper allow for the automatic pruning, de-skewing, and unwarping of an ima...
Documents in a wide range of genres often contain references to their own sections, pictures, etc. We call such referring expressions instances of Document Deixis. The present work ...
Annotating the regions, text lines and characters of document images is an important, but tedious and expensive task. A ground-truthing tool may largely alleviate the human burden...