Sciweavers

637 search results - page 36 / 128
» Training and documentation
ECIR
2008
Springer
14 years 11 months ago
Semi-supervised Document Classification with a Mislabeling Error Model
This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled...
Anastasia Krithara, Massih-Reza Amini, Jean-Michel...
ICDAR
2003
IEEE
15 years 3 months ago
Word Segmentation of Handwritten Dates in Historical Documents by Combining Semantic A-Priori-Knowledge with Local Features
The recognition of script in historical documents requires suitable techniques in order to identify single words. Segmentation of lines and words is a challenging task because lin...
Markus Feldbach, Klaus D. Tönnies
WIDM
2003
ACM
15 years 2 months ago
Clustering documents in a web directory
Hierarchical categorization of documents is a task receiving growing interest due to the widespread proliferation of topic hierarchies for text documents. The worst problem of hie...
Giordano Adami, Paolo Avesani, Diego Sona
ICDAR
2005
IEEE
15 years 3 months ago
Language Identification of Character Images Using Machine Learning Techniques
In this paper, we propose a new approach for identifying the language type of character images. We do this by classifying individual character images to determine the language bou...
Ying-Ho Liu, Fu Chang, Chin-Chin Lin
SAC
2010
ACM
15 years 4 months ago
Enhancing document structure analysis using visual analytics
During the last decade, national archives, libraries, museums and companies started to make their records, books and files electronically available. In order to allow efficient ac...
Andreas Stoffel, David Spretke, Henrik Kinnemann, ...