Sciweavers

ICDAR
2011
IEEE

A Table Detection Method for Multipage PDF Documents via Visual Seperators and Tabular Structures

Table detection is an important task in document analysis and recognition. In this paper, we propose a novel and effective table detection method for PDF documents that uses visual separators and geometric content layout information. The visual separators include not only graphic ruling lines but also white spaces, so that tables with or without ruling lines can be handled. Furthermore, we detect page columns to assist table region delimitation in pages with complex layouts. Evaluations of our algorithm on an e-Book dataset and a scientific document dataset show competitive performance. Notably, the proposed method has been successfully incorporated into a commercial software package for large-scale Chinese e-Book production.
Keywords: table detection; table spotting; PDF documents; separators; ruling lines
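The abstract does not give algorithmic details, so the following is only a minimal sketch of what a white-space separator could look like in practice, not the authors' method. It assumes text boxes are given as hypothetical `(x0, y0, x1, y1)` tuples and treats any horizontal span that no box covers, and that is at least `min_gap` wide, as a candidate vertical separator. The function name and the `min_gap` threshold are illustrative assumptions.

```python
def whitespace_gaps(boxes, min_gap=10.0):
    """Return (x_start, x_end) spans that no text box overlaps horizontally.

    boxes: iterable of (x0, y0, x1, y1) tuples (assumed layout, illustrative).
    min_gap: minimum width for a span to count as a separator (assumed value).
    """
    # Project every box onto the x-axis and sort by left edge.
    intervals = sorted((x0, x1) for x0, _, x1, _ in boxes)
    gaps = []
    if not intervals:
        return gaps
    covered_end = intervals[0][1]
    for x0, x1 in intervals[1:]:
        # A wide enough uncovered span between merged intervals is a gap.
        if x0 - covered_end >= min_gap:
            gaps.append((covered_end, x0))
        covered_end = max(covered_end, x1)
    return gaps


# Three columns of text boxes spread over two rows (made-up coordinates).
example_boxes = [
    (10, 0, 50, 10), (12, 12, 48, 22),
    (80, 0, 120, 10), (82, 12, 118, 22),
    (150, 0, 190, 10),
]
example_gaps = whitespace_gaps(example_boxes)
```

Gaps found this way would only be candidates; a real detector would also need ruling lines and page column information, as the abstract notes.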
Added 24 Dec 2011
Updated 24 Dec 2011
Type Journal
Year 2011
Where ICDAR
Authors Jing Fang, Liangcai Gao, Kun Bai, Ruiheng Qiu, Xin Tao, Zhi Tang