ICDAR
2003
IEEE

A Constraint-based Approach to Table Structure Derivation

This paper presents an approach to deriving an abstract geometric model of a table from a physical representation. The technique uses a graph of constraints between cells, which must be satisfied in order to determine the cells' relative horizontal and vertical positions. The method is evaluated on a test set of tables drawn from US Securities and Exchange Commission (SEC) filings.
1 Problem Description
Given a flat, textual representation of a table, we wish to derive an abstract geometric model that identifies the relative location of cells and captures their textual content. For example, for the ASCII table shown in Fig. 1, we derive the abstract geometric model represented by XML of the form <CELL X0="1" Y0="0" X1="3" Y1="0">YEAR ENDED DECEMBER 31,</CELL>. This XML description may then be used to deliver the appropriate HTML version of the table, or as input to further high-level applications such as an information extraction system. The process w...
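To make the cell model in the example concrete, the following is a minimal Python sketch of how one such cell might be held in memory, serialized to the XML form quoted above, and turned into an HTML table cell. It is not taken from the paper: the `Cell` class and its `to_xml` and `to_html` methods are hypothetical names, and the sketch assumes that X0..X1 and Y0..Y1 are inclusive column and row indices on the abstract grid, so that a cell with X0=1 and X1=3 spans three columns.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class Cell:
    """One cell of the abstract geometric model (hypothetical class).

    Assumption: (x0, y0) and (x1, y1) are inclusive grid coordinates of the
    cell's first and last column/row, as suggested by the XML example.
    """
    x0: int
    y0: int
    x1: int
    y1: int
    text: str

    def to_xml(self) -> str:
        # Serialize in the CELL form quoted in the abstract.
        attrs = f'X0="{self.x0}" Y0="{self.y0}" X1="{self.x1}" Y1="{self.y1}"'
        return f"<CELL {attrs}>{escape(self.text)}</CELL>"

    def to_html(self) -> str:
        # Inclusive coordinates: the span is (last - first + 1).
        colspan = self.x1 - self.x0 + 1
        rowspan = self.y1 - self.y0 + 1
        return (
            f'<td colspan="{colspan}" rowspan="{rowspan}">'
            f"{escape(self.text)}</td>"
        )


# The cell from the abstract's example.
cell = Cell(x0=1, y0=0, x1=3, y1=0, text="YEAR ENDED DECEMBER 31,")
print(cell.to_xml())
print(cell.to_html())
```

Under the inclusive-coordinate assumption, the example cell becomes a header that spans three columns and one row, which is the kind of information an HTML renderer or a downstream extraction step would need from the model.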
Matthew Hurst
Added 04 Jul 2010
Updated 04 Jul 2010
Type Conference
Year 2003
Where ICDAR
Authors Matthew Hurst