The number and sizes of parallel corpora keep growing, which makes it necessary to have automatic methods of processing them: combining, checking and improving corpora quality, et...
We address the problem of publishing parliamentary proceedings in a digitally sustainable manner. We give an extensive requirements analysis and, based on that, propose a uniform XML...
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
During the last decades, digital objects have become the primary medium to create, shape, and exchange information. However, in contrast to analog objects such as books that dire...
Christoph Becker, Andreas Rauber, Volker Heydegger...
The reasoning tasks that can be performed with semantic web service descriptions depend on the quality of the domain ontologies used to create these descriptions. However, buildin...
Marta Sabou, Chris Wroe, Carole A. Goble, Gilad Mi...