The quality of a statistical machine translation (SMT) system depends heavily on the number of parallel sentences used in training. In recent years, there have been several...
Text classification in the medical domain is a real-world problem with wide applicability. This paper extensively investigates the effect of text representation approaches on the p...
Fathi H. Saad, Beatriz de la Iglesia, Duncan G. Be...
Capturing knowledge from free-form evaluative texts about an entity is a challenging task. New techniques of feature extraction, polarity determination and strength evaluation hav...
In this paper, we present a fast and scalable Bayesian model for improving weakly annotated data – which is typically generated by a (semi) automated information extraction (IE) ...
In OCR systems, the character segmentation algorithm may generate mis-segmented blocks. Feedback information from the character classifier is indispensable to achieve higher character ...