Search Sciweavers | Sciweavers

93 search results - page 16 / 19

» Enhanced word clustering for hierarchical text classificatio...

DAS
2008
Springer

181 views, Document Analysis

A Complete Optical Character Recognition Methodology for Historical Documents

15 years 1 month ago

Download users.iit.demokritos.gr

In this paper a complete OCR methodology for recognizing historical documents, either printed or handwritten without any knowledge of the font, is presented. This methodology cons...

Georgios Vamvakas, Basilios Gatos, Nikolaos Stamat...

EMNLP
2010

124 views, Natural Language Processing

Evaluating Models of Latent Document Semantics in the Presence of OCR Errors

14 years 9 months ago

Download www.aclweb.org

Models of latent document semantics such as the mixture of multinomials model and Latent Dirichlet Allocation have received substantial attention for their ability to discover top...

Daniel David Walker, William B. Lund, Eric K. Ring...

SIGIR
2008
ACM

100 views, Information Technology

Optical character recognition errors and their effects on natural language processing

14 years 11 months ago

Download www.cse.lehigh.edu

Errors are unavoidable in advanced computer vision applications such as optical character recognition, and the noise induced by these errors presents a serious challenge to downstr...

Daniel P. Lopresti

ACL
1994

120 views, Computational Linguistics

A Corpus-Based Approach to Automatic Compound Extraction

15 years 29 days ago

Download www.mt-archive.info

An automatic compound retrieval method is proposed to extract compounds within a text message. It uses n-gram mutual information, relative frequency count and parts of speech as t...

Keh-Yih Su, Ming-Wen Wu, Jing-Shin Chang

TREC
2007

123 views, Information Technology

WIM at TREC 2007

15 years 23 days ago

Download trec.nist.gov

This paper introduces the four tracks in which the WIM Lab at Fudan University took part at TREC 2007. For the spam track, a multi-centre model was proposed considering the characteristi...

Jun Xu, Jing Yao, Jiaqian Zheng, Qi Sun, Junyu Niu
