COLING
2000

Mining Tables from Large Scale HTML Texts

Tables are a very common presentation scheme, but few papers address table extraction in text data mining. This paper focuses on mining tables from large-scale HTML texts. It discusses table filtering, recognition, interpretation, and presentation. Heuristic rules and cell similarities are employed to identify tables; the F-measure of table recognition is 86.50%. We also propose an algorithm to capture attribute-value relationships among table cells. Finally, more structured data is extracted and presented.
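To make the filtering and evaluation steps concrete, here is a minimal sketch of what heuristic table filtering over HTML might look like. It is not the authors' algorithm: the class `TableCollector`, the function `looks_like_data_table`, and its thresholds `min_rows` and `min_cols` are hypothetical names and values chosen for illustration, and `f_measure` assumes the balanced form (harmonic mean of precision and recall), which the abstract does not spell out.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect the cell text of each <table>; nested tables are not handled."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables: list of rows, each a list of cell strings
        self._rows = None  # rows of the table currently being read
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table" and self._rows is None:
            self._rows = []
        elif tag == "tr" and self._rows is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._rows.append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table" and self._rows is not None:
            self.tables.append(self._rows)
            self._rows = self._row = self._cell = None


def looks_like_data_table(rows, min_rows=2, min_cols=2):
    """Hypothetical filter: require enough non-empty rows and enough columns."""
    filled = [r for r in rows if any(c for c in r)]
    widest = max((len(r) for r in filled), default=0)
    return len(filled) >= min_rows and widest >= min_cols


def f_measure(precision, recall):
    """Balanced F-measure, assumed here to be the harmonic mean of P and R."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up input: a layout-only table followed by a small data table.
html = """
<table><tr><td><img src="logo.gif"></td></tr></table>
<table>
  <tr><th>City</th><th>Fare</th></tr>
  <tr><td>Taipei</td><td>120</td></tr>
</table>
"""
parser = TableCollector()
parser.feed(html)
kept = [t for t in parser.tables if looks_like_data_table(t)]
```

With this made-up input, the first `<table>` only wraps an image and is filtered out, while the second survives. A fuller system along the lines the abstract describes would also compare cells for similarity and then interpret which cells act as attributes and which as values; neither step is attempted here.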
Hsin-Hsi Chen, Shih-Chung Tsai, Jin-He Tsai
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2000
Where COLING