Cheap and versatile cameras make it possible to easily and quickly capture a wide variety of documents. However, low resolution cameras present a challenge to OCR because it is vi...
Charles E. Jacobs, Patrice Y. Simard, Paul A. Viol...
In a text categorization task, classification on some hierarchy of classes shows better results than the case without the hierarchy. In current environments where large amount of ...
Bug triage, deciding what to do with an incoming bug report, is taking up increasing amount of developer resources in large open-source projects. In this paper, we propose to appl...
When recognizing multiple fonts, geometric features, such as the directional information of strokes, are generally robust against deformation but are weak against degradation. Thi...
Abstract. Parallel corpora are a valuable resource for machine translation, but at present their availability and utility is limited by genreand domain-speci city, licensing restri...