Sciweavers


DAS
2010
Springer


Text extraction from graphical document images using sparse representation


Download hal.inria.fr

This paper presents a novel method for extracting text from graphical document images. Graphical document images containing text and graphics components are treated as two-dimensional signals in which text and graphics have different morphological characteristics. The proposed algorithm relies on a sparse representation framework with two appropriately chosen discriminative overcomplete dictionaries, each of which gives a sparse representation of one type of signal and a non-sparse representation of the other. Text and graphics components are separated by promoting sparse representations of the input image in these two dictionaries. In post-processing, heuristic rules group text components into text strings. The proposed method overcomes the problem of text touching graphics. Preliminary experiments show promising results on different types of documents. Categories and Subject Descriptors I.4.6 [Image Processing and Computer Visio...
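
The two-dictionary separation idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the abstract does not name its dictionaries or its solver, so the sketch uses a 1-D signal, an orthonormal DCT basis as a stand-in for the dictionary that sparsely represents one component, the identity (spike) basis as a stand-in for the other, and a simple alternating hard-thresholding loop with a decreasing threshold. The function names `dct_matrix`, `hard_threshold`, and `separate` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; each row is one cosine atom."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    d[0, :] /= np.sqrt(2.0)
    return d

def hard_threshold(x, lam):
    """Keep entries whose magnitude exceeds lam, zero out the rest."""
    return np.where(np.abs(x) > lam, x, 0.0)

def separate(signal, n_iter=100):
    """Split a 1-D signal into a DCT-sparse part and a spike-sparse part.

    Assumed toy scheme: alternate between the two dictionaries, each time
    keeping only the large coefficients of the residual left by the other
    component, while the threshold decreases linearly.
    """
    n = signal.size
    D = dct_matrix(n)
    smooth = np.zeros(n)
    spikes = np.zeros(n)
    lam_max = max(np.abs(D @ signal).max(), np.abs(signal).max())
    for lam in np.linspace(lam_max, 0.01 * lam_max, n_iter):
        # Part that is sparse in the DCT dictionary.
        smooth = D.T @ hard_threshold(D @ (signal - spikes), lam)
        # Part that is sparse in the identity (spike) dictionary.
        spikes = hard_threshold(signal - smooth, lam)
    return smooth, spikes

# Synthetic mixture: one cosine atom plus two isolated spikes.
n = 128
true_smooth = 10.0 * dct_matrix(n)[4]
true_spikes = np.zeros(n)
true_spikes[20] = 5.0
true_spikes[70] = -4.0
smooth, spikes = separate(true_smooth + true_spikes)
print("recovered spike positions:", np.flatnonzero(spikes))
```

Each component here is sparse in one basis and spread out in the other, which is the property the abstract requires of its two dictionaries. A real document image would need 2-D overcomplete dictionaries suited to text and to graphics, which this sketch does not attempt.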

Thai V. Hoang, Salvatore Tabbone


DAS 2010 | Document Analysis | Graphical Document Images | Sparse Representation | Text Component


Related Content

» Text detection from scene images using sparse representation

» Text/Graphics Separation using Agent-based Pyramid Operations

» Text/Graphic labelling of Ancient Printed Documents

» Enhanced Text Extraction from Arabic Degraded Document Images Using EM Algorithm

» Text Extraction from Gray Scale Historical Document Images Using Adaptive Local Connectivi...

» An Approach to Extracting the Target Text Line from a Document Image Captured by a Pen Sca...

» From sentence to emotion: a real-time three-dimensional graphics metaphor of emotions extract...

» Radon Transform for Lineal Symbol Representation

» Learning an enriched representation from unlabeled data for protein-protein interaction ext...

Post Info

Added	24 Aug 2010
Updated	24 Aug 2010
Type	Conference
Year	2010
Where	DAS
Authors	Thai V. Hoang, Salvatore Tabbone
