Standard algorithms for template-based information extraction (IE) require predefined template schemas, and often labeled data, to learn to extract their slot fillers (e.g., an ...
Graphical components information extraction is a crucial step in the chart recognition and understanding process. However, existing methods of information extraction from chart im...
This paper presents a morphological scale-space approach to the problem of feature extraction. The method relies on two steps: a hierarchical simplification step based on pyramids...
Text documents often embed data that is structured in nature, and we can expose this structured data using information extraction technology. By processing a text database with inf...
This paper studies structured data extraction from Web pages, e.g., online product description pages. Existing approaches to data extraction include wrapper induction and automatic...