We present a methodology that takes as input scanned documents of typed or hand-written text, and produces transcriptions of the text as output. Instead of using OCR technology, t...
Two different systems are proposed for the task of capitalisation generation. The first system is a slightly modified speech recogniser. In this system, every word in the vocabula...
This paper describes a system for processing economic documents written in the ancient Sumerian language. The system is application-oriented and takes advantage of the simplicity ...
Cross-language Text Categorization is the task of assigning semantic classes to documents written in a target language (e.g. English) while the system is trained using labeled doc...
Text classification is one of the most actual among the natural language processing problems. In this paper the application of word-based PPM (Prediction by Partial Matching) mode...