Sciweavers



AND
2010


Document: a useful level for facing noisy data


Download www.xrce.xerox.com

In this paper we present a set of experiments on large digitalized collections of books, showing that logical structures can be extracted with good quality when working at the document level. The proposed solution relies on a two-step method. First, specific logical elements are recognized by a given method. Then, models for the recognized elements are generated by combining layout, content and labeling information. These inferred models are used to correct noisy data, typically zoning, OCR and labeling errors produced by previous processing steps. The method is illustrated with the extraction of page numbers and chapter headings, two navigation elements required by digital libraries.

Categories and Subject Descriptors: I.7.5 [Document and Text Processing]: Document Capture - Optical character recognition (OCR), Document analysis

General Terms: Experimentation

Keywords: Logical Analysis, error correction, model
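To make the idea of document-level correction concrete, here is a minimal Python sketch for the page-number case. It is not the authors' method, which the abstract does not detail. It assumes a single numbering sequence per book and a hypothetical input format: one OCR-read number per scanned page, or `None` when nothing was recognized. The function name `correct_page_numbers` is illustrative only.

```python
from collections import Counter
from typing import List, Optional


def correct_page_numbers(ocr_numbers: List[Optional[int]]) -> List[int]:
    """Correct noisy per-page numbers with a document-level offset model.

    ocr_numbers[i] is the number read on the i-th scanned page,
    or None when no number was recognized (hypothetical input format).
    """
    # Step 1: every recognized number votes for an offset,
    # defined as the printed number minus the page index.
    offsets = Counter(
        number - index
        for index, number in enumerate(ocr_numbers)
        if number is not None
    )
    if not offsets:
        raise ValueError("no page number was recognized in the document")

    # Step 2: the inferred model is the most frequent offset.
    dominant_offset, _ = offsets.most_common(1)[0]

    # Step 3: regenerate every page number from the model, which
    # overrides misreads and fills in the pages with no reading.
    return [index + dominant_offset for index in range(len(ocr_numbers))]


# Page 3 was misread as 74, and two pages have no reading at all.
noisy = [None, 12, 13, 74, 15, None, 17, 18]
print(correct_page_numbers(noisy))
```

A single misread cannot be detected from one page alone, but it stands out once all pages vote on a shared model. Real books often carry several numbering sequences (for example roman numerals in the front matter), which this sketch deliberately ignores.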

Hervé Déjean, Jean-Luc Meunier



Post Info

Added	10 Feb 2011
Updated	10 Feb 2011
Type	Journal
Year	2010
Where	AND
Authors	Hervé Déjean, Jean-Luc Meunier
