Enabling a domain expert to maintain his own knowledge in a Knowledge-Based System has long been an ideal for the Knowledge Engineering community. In this paper, we report on our ex...
A parallel corpus is a rich linguistic resource for various multilingual text management tasks, including crosslingual text retrieval, multilingual computational linguistics and mul...
We describe our participation in the TREC 2004 Web and Terabyte tracks. For the web track, we employ mixture language models based on document full-text, incoming anchortext, and...
The capabilities of XSLT processing are widely used to transform XML documents into target XML documents. These target XML documents conform to output schemas of the used XSLT styl...
This paper presents an objective comparative evaluation of layout analysis methods in realistic circumstances. It describes the Page Segmentation competition (modus operandi, data...
Apostolos Antonacopoulos, Stefan Pletschacher, Dav...