Sciweavers

» Accuracy Improvement and Objective Evaluation of Annotation ...
NAACL
2010
13 years 3 months ago
Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment
The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several...
Jason R. Smith, Chris Quirk, Kristina Toutanova
DMIN
2006
13 years 6 months ago
Effect of Document Representation on the Performance of Medical Document Classification
Text classification in the medical domain is a real world problem with wide applicability. This paper investigates extensively the effect of text representation approaches on the p...
Fathi H. Saad, Beatriz de la Iglesia, Duncan G. Be...
KCAP
2005
ACM
13 years 10 months ago
Extracting knowledge from evaluative text
Capturing knowledge from free-form evaluative texts about an entity is a challenging task. New techniques of feature extraction, polarity determination and strength evaluation hav...
Giuseppe Carenini, Raymond T. Ng, Ed Zwart
ICWE
2007
Springer
13 years 11 months ago
Fixing Weakly Annotated Web Data Using Relational Models
In this paper, we present a fast and scalable Bayesian model for improving weakly annotated data – which is typically generated by a (semi) automated information extraction (IE) ...
Fatih Gelgi, Srinivas Vadrevu, Hasan Davulcu
ICDAR
2003
IEEE
13 years 10 months ago
Rejection Algorithm for Mis-segmented Characters In Multilingual Document Recognition
In OCR systems the character segmentation algorithm may generate mis-segmented blocks. Feedback information from character classifier is indispensable to achieve higher character ...
Zhengang Chen, Xiaoqing Ding