Assessing semantic similarity between text documents is a crucial aspect of Information Retrieval systems. In this work, we propose to use hyperlink information to derive a simila...
Text categorization involves mapping documents to a fixed set of labels. A similar and equally important problem is that of assigning labels to large corpora. With a deluge of ...
We present the first known result of high-precision rare-word bilingual extraction from comparable corpora, using aligned comparable documents and supervised classification. We in...
In this paper, we present an analysis of the requirements and design choices for hands-free documentation. Hands-busy tasks such as cooking or car repair may require substantial...
A paper document processing system is an information system component which transforms information on printed or handwritten documents into a computer-revisable form. In ...
Floriana Esposito, Donato Malerba, Francesca A. Li...