Commercial OCR packages work best with high-quality scanned images. They often produce poor results when the image is degraded, either because the original itself was poor quality,...
Abstract. The current UML semantics documentation has made a significant step towards providing a precise description of the UML. However, at present the semantic model it proposes ...
If future electronic documents are to be truly useful, we must devise ways to automatically turn them into knowledge bases. In particular, we must be able to do this for diagrams. ...
We propose an online topic model for sequentially analyzing the time evolution of topics in document collections. Topics naturally evolve with multiple timescales. For example, so...
This paper details the participation of the XLDB group from the University of Lisbon at the GeoCLEF task of CLEF 2006. We tested text mining methods that make use of an ontology t...
Bruno Martins, Nuno Cardoso, Marcirio Silveira Cha...