Sciweavers

5 search results - page 1 / 1
DRR
2010
13 years 7 months ago
Efficient automatic OCR word validation using word partial format derivation and language model
In this paper we present an OCR validation module, implemented for the System for Preservation of Electronic Resources (SPER) developed at the U.S. National Library of Medicine. ...
Siyuan Chen, Dharitri Misra, George R. Thoma
ACL
2008
13 years 6 months ago
Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation
In statistical language modeling, one technique to reduce the problematic effects of data sparsity is to partition the vocabulary into equivalence classes. In this paper we invest...
Jakob Uszkoreit, Thorsten Brants
GECCO
2003
Springer
13 years 10 months ago
Studying the Advantages of a Messy Evolutionary Algorithm for Natural Language Tagging
The process of labeling each word in a sentence with one of its lexical categories (noun, verb, etc.) is called tagging and is a key step in parsing and many other language processi...
Lourdes Araujo
CICLING
2009
Springer
13 years 9 months ago
Language Identification on the Web: Extending the Dictionary Method
Abstract. Automated language identification of written text is a well-established research domain that has received considerable attention in the past. By now, efficient and effecti...
Radim Rehurek, Milan Kolkus
ACL
2008
13 years 6 months ago
Unsupervised Learning of Acoustic Sub-word Units
Accurate unsupervised learning of phonemes of a language directly from speech is demonstrated via an algorithm for joint unsupervised learning of the topology and parameters of a ...
Balakrishnan Varadarajan, Sanjeev Khudanpur, Emman...