Sciweavers

2 search results - page 1 / 1
LREC
2008
13 years 6 months ago
Certification and Cleaning up of a Text Corpus: Towards an Evaluation of the "Grammatical" Quality of a Corpus
We present in this article the methods we used for obtaining measures to ensure the quality and well-formedness of a text corpus. These measures allow us to determine the compatib...
Cyril Grouin
CICLING
2008
Springer
13 years 6 months ago
Non-interactive OCR Post-correction for Giga-Scale Digitization Projects
This paper proposes a non-interactive system for reducing the level of OCR-induced typographical variation in large text collections, contemporary and historical. Text-Induced Corp...
Martin Reynaert