This paper presents an Italic/Roman word type recognition system without a priori knowledge on the characters' font. This method aims at analyzing old documents in which char...
We present a corpus{based approach to word{sense disambiguation that only requires information that can be automatically extracted from untagged text. We use unsupervised techniqu...
In this paper, we extend previous work on the automatic structuring of medical documents using content analysis. Our long-term objective is to take advantage of specific rhetoric ...
Gersende Georg, Hugo Hernault, Marc Cavazza, Helmu...
Text classification systems on biomedical literature aim to select relevant articles to a specific issue from large corpora. Most systems with an acceptable accuracy are based o...
Ontology is playing an increasingly important role in knowledge management and the Semantic Web. This study presents a novel episode-based ontology construction mechanism to extra...