This paper describes the character recognition process from printed documents containing Hindi and Telugu text. Hindi and Telugu are among the most popular languages in India. The...
C. V. Jawahar, M. N. S. S. K. Pavan Kumar, S. S. R...
The problem of document categorization is considered. The set of domains and the keywords specific to each domain are assumed to be selected beforehand as initial data. We apply...
Mikhail Alexandrov, Alexander F. Gelbukh, George L...
We report on the current state of development of a document suite and its applications. This collection of tools for the flexible and robust processing of documents in German i...
In many criminal cases, forensically collected data contain valuable information about a suspect’s social networks. An investigator often has to manually extract information fro...
Rabeah Al-Zaidy, Benjamin C. M. Fung, Amr M. Youss...
We consider an interactive information retrieval task in which the user is interested in finding several to many relevant documents with minimal effort. Given an initial documen...