Sciweavers

25 search results - page 3 / 5
WWW
2002
ACM
14 years 6 months ago
A machine learning based approach for table detection on the web
Table is a commonly used presentation scheme, especially for describing relational information. However, table understanding remains an open problem. In this paper, we consider th...
Yalin Wang, Jianying Hu
DOCENG
2007
ACM
13 years 9 months ago
Speculative document evaluation
Optimisation of real world Variable Data printing (VDP) documents is a difficult problem because the interdependencies between layout functions may drastically reduce the number o...
Alexander J. Macdonald, David F. Brailsford, Steve...
DLIB
2002
13 years 5 months ago
Information Retrieval by Semantic Analysis and Visualization of the Concept Space of D-Lib Magazine
In this article we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary gene...
Junliang Zhang, Javed Mostafa, Himansu Tripathy
ICDAR
2009
IEEE
13 years 3 months ago
Analysis of Book Documents' Table of Content Based on Clustering
Table of contents (TOC) recognition has attracted a great deal of attention in recent years. After reviewing the merits and drawbacks of the existing TOC recognition methods, we h...
Liangcai Gao, Zhi Tang, Xiaofan Lin, Xin Tao, Yimi...
JCDL
2006
ACM
13 years 11 months ago
Combining DOM tree and geometric layout analysis for online medical journal article segmentation
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
Jie Zou, Daniel X. Le, George R. Thoma