Sciweavers

37 search results - page 1 / 8
ICDAR
2011
IEEE
12 years 4 months ago
Extending Page Segmentation Algorithms for Mixed-Layout Document Processing
The goal of this work is to add the capability to segment documents containing text, graphics, and pictures in the open source OCR engine OCRopus. To achieve this goal, OCRopus...
Amy Winder, Tim L. Andersen, Elisa H. Barney Smith
ICDAR
2003
IEEE
13 years 10 months ago
Evaluating SEE - A Benchmarking System for Document Page Segmentation
The decomposition of a document into segments such as text regions and graphics is a significant part of the document analysis process. The basic requirement for rating and impro...
Stefan Agne, Andreas Dengel, Bertin Klein
CICLING
2009
Springer
13 years 8 months ago
Language Identification on the Web: Extending the Dictionary Method
Automated language identification of written text is a well-established research domain that has received considerable attention in the past. By now, efficient and effecti...
Radim Rehurek, Milan Kolkus
WWW
2006
ACM
14 years 5 months ago
Using graph matching techniques to wrap data from PDF documents
Wrapping is the process of navigating a data source, semiautomatically extracting data and transforming it into a form suitable for data processing applications. There are current...
Tamir Hassan, Robert Baumgartner
CIKM
1999
Springer
13 years 9 months ago
Word Segmentation and Recognition for Web Document Framework
It is observed that a better approach to Web information understanding is to base it on its document framework, which consists mainly of (i) the title and the URL name of the pag...
Chi-Hung Chi, Chen Ding, Andrew Lim