Sciweavers

Automatic evaluation of aspects of document quality
LREC
2008
Experiments on Processing Overlapping Parallel Corpora
The number and sizes of parallel corpora keep growing, which makes it necessary to have automatic methods of processing them: combining, checking and improving corpora quality, et...
Mark Fishel, Heiki Jaan Kaalep
DGO
2010
Digital sustainable publication of legacy parliamentary proceedings
We address the problem of publishing parliamentary proceedings in a digital sustainable manner. We give an extensive requirements analysis, and based on that propose a uniform XML...
Maarten Marx, Nelleke Aders, Anne Schuth
CIKM
2008
Springer
Achieving both high precision and high recall in near-duplicate detection
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
Lian'en Huang, Lei Wang, Xiaoming Li
JUCS
2008
Systematic Characterisation of Objects in Digital Preservation: The eXtensible Characterisation Languages
During the last decades, digital objects have become the primary medium to create, shape, and exchange information. However, in contrast to analog objects such as books that dire...
Christoph Becker, Andreas Rauber, Volker Heydegger...
WWW
2005
ACM
Learning domain ontologies for Web service descriptions: an experiment in bioinformatics
The reasoning tasks that can be performed with semantic web service descriptions depend on the quality of the domain ontologies used to create these descriptions. However, buildin...
Marta Sabou, Chris Wroe, Carole A. Goble, Gilad Mi...