This paper presents a comparative study of lossless compression algorithms. The following algorithms are considered: UNIX compress, gzip, LZW, CCITT Group 3 and Group 4, J...
We propose an algorithm for the binarization of document images degraded by uneven light distribution, based on Markov Random Field modeling with Maximum A Posteriori probabil...
We report on the design and implementation of a system which automates the process of capturing structured documents from the optically recognized form of printed materials. The sy...
In this paper we propose a probabilistic model for online document clustering. We use a non-parametric Dirichlet process prior to model the growing number of clusters, and use a pri...
Word Sense Disambiguation (WSD), in the field of Natural Language Processing (NLP), consists in assigning the correct sense (semantics) to a word form (lexeme) by means of the cont...
Davide Buscaldi, Giovanna Guerrini, Marco Mesiti, ...