We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...
In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than the HTML code. O...
Summarizing web pages have recently gained much attention from researchers. Until now two main types of approaches have been proposed for this task: content- and context-based met...
The method herein proposed detects text lines on handwritten pages which may include either lines oriented in several directions, erasures, or annotationsbetween main lines. The m...