Search Sciweavers | Sciweavers

LREC
2008

Certification and Cleaning up of a Text Corpus: Towards an Evaluation of the "Grammatical" Quality of a Corpus

13 years 6 months ago

Download www.lrec-conf.org

We present in this article the methods we used for obtaining measures to ensure the quality and well-formedness of a text corpus. These measures allow us to determine the compatib...

Cyril Grouin

CICLING
2008
Springer

Non-interactive OCR Post-correction for Giga-Scale Digitization Projects

13 years 7 months ago

Download ilk.uvt.nl

This paper proposes a non-interactive system for reducing the level of OCR-induced typographical variation in large text collections, contemporary and historical. Text-Induced Corp...

Martin Reynaert

LREC
2010

Entity Mention Detection using a Combination of Redundancy-Driven Classifiers

13 years 6 months ago

Download www.lrec-conf.org

We present an experimental framework for Entity Mention Detection in which two different classifiers are combined to exploit Data Redundancy attained through the annotation of a l...

Silvana Marianela Bernaola Biggio, Manuela Speranz...

ICML
1997
IEEE

A Comparative Study on Feature Selection in Text Categorization

13 years 9 months ago

Download net.pku.edu.cn

This paper is a comparative study of feature selection methods in statistical learning of text categorization. The focus is on aggressive dimensionality reduction. Five methods we...

Yiming Yang, Jan O. Pedersen
