Complex objects can often be conveniently represented by finite sets of simpler components, such as images by sets of patches or texts by bags of words. We study the class of posi...
Any large language processing software relies in its operation on heuristic decisions concerning the strategy of processing. These decisions are usually "hard-wired" int...
This paper addresses the alignment issue in the context of exploiting large bilingual and multilingual corpora for translation purposes. A generic alignment scheme is proposed that can...
One of the most interesting issues in the field of cultural heritage is the adoption of multimedia systems for the visualization and organization of information. In this paper we ...
Thomas M. Alisi, Gianpaolo D'Amico, Andrea Ferraca...
Chemical named entities represent an important facet of biomedical text. We have developed a system to use character-based n-grams, Maximum Entropy Markov Models and rescoring to r...