Sciweavers

19 search results - page 2 / 4
» Text Separation from Mixed Documents Using a Tree-Structured...
SMC
2010
IEEE
13 years 3 months ago
Semantic enrichment of text representation with wikipedia for text classification
Text classification is a widely studied topic in the area of machine learning. A number of techniques have been developed to represent and classify text documents. Most of the t...
Hiroki Yamakawa, Jing Peng, Anna Feldman
ICDAR
2005
IEEE
13 years 10 months ago
Distinguishing Mathematics Notation from English Text using Computational Geometry
A trainable method for distinguishing between mathematics notation and natural language (here, English) in images of textlines, using computational geometry methods only with no a...
Derek M. Drake, Henry S. Baird
SDM
2004
SIAM
13 years 6 months ago
Classifying Documents Without Labels
Automatic classification of documents is an important area of research with many applications in the fields of document searching, forensics and others. Methods to perform classif...
Daniel Barbará, Carlotta Domeniconi, Ning K...
ICDAR
2011
IEEE
12 years 4 months ago
Localization of Digit Strings in Farsi/Arabic Document Images Using Structural Features and Syntactical Analysis
This paper presents a new method for localization of digit strings with a specific syntax in Farsi/Arabic document images. First, some features are extracted from all connected...
Ali Abedi, Karim Faez
ICDAR
2007
IEEE
13 years 8 months ago
Iterated Document Content Classification
We report an improved methodology for training classifiers for document image content extraction, that is, the location and segmentation of regions containing handwriting, machine...
Chang An, Henry S. Baird, Pingping Xiu