The Web is constantly changing, but most tools used to access Web content deal only with what can be captured at a single instant in time. As a result, Web users may not have a g...
OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...
In this paper, we present InfoScent Evaluator, a tool that automatically evaluates the semantic appropriateness of the descriptions of hyperlinks in web pages. The tool is based o...
Christos Katsanos, Nikolaos K. Tselios, Nikolaos M...
We describe the design and use of a personal digital library system, UpLib. The system consists of a full-text indexed repository accessed through an active agent via a Web interf...
In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...