We consider the problem of modeling the content structure of texts within a specific domain, in terms of the topics the texts address and the order in which these topics appear. W...
This paper presents a new approach for the binarization of seriously degraded manuscript. We introduce a new technique based on a Markov Random Field (MRF) model of the document. ...
A new approach to the Text Categorization problem is here presented. It is called Gaussian Weighting and it is a supervised learning algorithm that, during the training phase, est...
Abstract. Dealing with heterogeneous ontologies by means of semantic mappings has become an important area of research and a number of systems for discovering mappings between onto...
It is difficult to apply machine learning to new domains because often we lack labeled problem instances. In this paper, we provide a solution to this problem that leverages domai...