Automatic table layout is required in web applications. Unfortunately, this is NP-hard for reasonable layout requirements such as minimizing table height for a given width. One ap...
Semantic Web technologies offer the possibility of increased accuracy and completeness in search and retrieval operations. In recent years, curators of data resources have begun f...
Kevin L. Garwood, Phillip W. Lord, Helen Parkinson...
The goal of producing effective Optical Character Recognition (OCR) methods has led to the development of a number of algorithms. The purpose of these is to take the hand-written o...
Complex documents stored in a flat or partially marked-up file format require layout-sensitive preprocessing before any natural language processing can be carried out on their tex...
Classification of documents by genre is typically done using either linguistic analysis or term-frequency-based techniques. The former provides better classification accuracy than...