Sciweavers

PR
2006
Document zone content classification and its performance evaluation
This paper describes an algorithm for the determination of zone content type of a given zone within a document image. We take a statistical based approach and represent each zone ...
Yalin Wang, Ihsin T. Phillips, Robert M. Haralick
JCDL
2006
ACM
Combining DOM tree and geometric layout analysis for online medical journal article segmentation
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
Jie Zou, Daniel X. Le, George R. Thoma
AUSDM
2008
Springer
Structure-Based Document Model with Discrete Wavelet Transforms and Its Application to Document Classification
Term signal is an existing text representation that depicts a term as a vector of frequencies of occurrences in a number of user-defined partitions of a document. Although term si...
Supphachai Thaicharoen, Tom Altman, Krzysztof J. C...
DAS
2010
Springer
Context-aware and content-based dynamic Voronoi page segmentation
This paper presents a dynamic approach to document page segmentation based on inter-component relationships and their local features. State-of-the-art page segmentation algorithms...
Mudit Agrawal, David S. Doermann
LREC
2008
New Resources for Document Classification, Analysis and Translation Technologies
The goal of the DARPA MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Program is to automatically convert foreign language text images into Englis...
Stephanie Strassel, Lauren Friedman, Safa Ismael, ...