DAS
2010
Springer

Analysis and taxonomy of column header categories for web tables

We describe a component of a document analysis system for constructing ontologies for domain-specific web tables imported into Excel. This component automates extraction of the Wang notation for the column header of a table. Using column-header-specific rules for XY cutting, we convert the geometric structure of the column header to a linear string denoting cell attributes and directions of cuts. The string representation is parsed by a context-free grammar, and the parse tree is further processed to produce a data-type representation (the Wang notation tree) of each column category. Experiments were carried out to evaluate this scheme on the original and edited column headers of Excel tables drawn from a collection of 200 used in our earlier work. The transformed headers were obtained by editing the original column headers to conform to the format targeted by our grammar. Forty-four original headers and their reformatted versions were submitted as input to our software system. Our gramm...
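
The idea of turning a column header's geometry into a linear string by recursive XY cutting can be illustrated with a minimal sketch. This is not the authors' implementation: the cell encoding, the function names `cut_positions` and `xy_cut`, and the `V(...)` / `H(...)` string format are all assumptions made for illustration, and the paper's actual cut rules and grammar are not reproduced here.

```python
from typing import List, Tuple

# Hypothetical cell encoding: (top, left, bottom, right, label), half-open grid
# coordinates. A merged header cell simply covers more than one row or column.
Cell = Tuple[int, int, int, int, str]


def cut_positions(cells: List[Cell], axis: str, lo: int, hi: int) -> List[int]:
    """Grid lines strictly between lo and hi that no cell straddles."""
    start, end = (1, 3) if axis == "V" else (0, 2)
    return [p for p in range(lo + 1, hi)
            if all(not (c[start] < p < c[end]) for c in cells)]


def xy_cut(cells: List[Cell], top: int, left: int, bottom: int, right: int) -> str:
    """Recursively cut a header region and emit a linear string.

    Vertical cuts are tried before horizontal ones (an assumed rule), so a
    category label ends up grouped with the sub-headers beneath it.
    """
    if not cells:
        return "EMPTY"          # blank region (assumed placeholder token)
    if len(cells) == 1:
        return cells[0][4]      # a single cell is a leaf: emit its label
    for axis in ("V", "H"):
        lo, hi = (left, right) if axis == "V" else (top, bottom)
        cuts = cut_positions(cells, axis, lo, hi)
        if not cuts:
            continue
        bounds = [lo] + cuts + [hi]
        parts = []
        for a, b in zip(bounds, bounds[1:]):
            if axis == "V":
                sub = [c for c in cells if a <= c[1] and c[3] <= b]
                parts.append(xy_cut(sub, top, a, bottom, b))
            else:
                sub = [c for c in cells if a <= c[0] and c[2] <= b]
                parts.append(xy_cut(sub, a, left, b, right))
        return axis + "(" + " ".join(parts) + ")"
    raise ValueError("region cannot be separated by XY cuts")


# A made-up two-row header: "Year" spans both rows; "Sales" spans Q1 and Q2.
header = [
    (0, 0, 2, 1, "Year"),
    (0, 1, 1, 3, "Sales"),
    (1, 1, 2, 2, "Q1"),
    (1, 2, 2, 3, "Q2"),
]
print(xy_cut(header, 0, 0, 2, 3))  # V(Year H(Sales V(Q1 Q2)))
```

In this sketch the nesting of the output string mirrors the category hierarchy, which is the kind of structure a context-free grammar could then parse; labels containing spaces would need quoting before such parsing.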
Added 02 Sep 2010
Updated 02 Sep 2010
Type Conference
Year 2010
Where DAS
Authors Sharad C. Seth, Ramana Chakradhar Jandhyala, Mukkai S. Krishnamoorthy, George Nagy