Many languages are in serious danger of being lost and as a result, there has been a significant increase in language documentation projects, and also in attempts to preserve lang...
Despite the current practice of re-keying most documents placed in digital libraries, we continue to try to improve accuracy of automated recognition techniques for obtaining docum...
Commercial OCR packages work best with highquality scanned images. They often produce poor results when the image is degraded, either because the original itself was poor quality,...
In this paper we propose a methodology to learn to extract domain-specific information from large repositories (e.g. the Web) with minimum user intervention. Learning is seeded b...
Fabio Ciravegna, Alexiei Dingli, David Guthrie, Yo...
Our approach to the Log Analysis for Digital Societies (LADS) task of LogCLEF 2009 is to define three different levels of performance: success, failure and strong failure. To inve...