Table is a commonly used presentation scheme, especially for describing relational information. However, table understanding remains an open problem. In this paper, we consider th...
Optimisation of real world Variable Data printing (VDP) documents is a difficult problem because the interdependencies between layout functions may drastically reduce the number o...
Alexander J. Macdonald, David F. Brailsford, Steve...
In this article we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary gene...
Table of contents (TOC) recognition has attracted a great deal of attention in recent years. After reviewing the merits and drawbacks of the existing TOC recognition methods, we h...
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...