With an aim to extract the structural information from the table of contents (TOC) to help develop digital document library the requirement of identifying/segmenting the TOC page ...
S. Mandal, S. P. Chowdhury, Amit Kumar Das, Bhabat...
We describe a component of a document analysis system for constructing ontologies for domain-specific web tables imported into Excel. This component automates extraction of the Wa...
Sharad C. Seth, Ramana Chakradhar Jandhyala, Mukka...
Table of contents (TOC) recognition has attracted a great deal of attention in recent years. After reviewing the merits and drawbacks of the existing TOC recognition methods, we h...
Tables are a ubiquitous form of communication. While everyone seems to know what a table is, a precise, analytical definition of "tabularity" remains elusive because some...
David W. Embley, Matthew Hurst, Daniel P. Lopresti...
— We present a general approach for the hierarchical segmentation and labeling of document layout structures. This approach models document layout as a grammar and performs a glo...