Sciweavers

37 search results - page 2 / 8
» Extending Page Segmentation Algorithms for Mixed-Layout Docu...
JCDL
2006
ACM
13 years 11 months ago
Combining DOM tree and geometric layout analysis for online medical journal article segmentation
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
Jie Zou, Daniel X. Le, George R. Thoma
WWW
2002
ACM
14 years 6 months ago
Using web structure for classifying and describing web pages
The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...
WWW
2006
ACM
14 years 6 months ago
Visually guided bottom-up table detection and segmentation in web documents
In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than the HTML code. O...
Bernhard Krüpl, Marcus Herzog
WWW
2004
ACM
14 years 6 months ago
Web page summarization using dynamic content
Summarizing web pages has recently gained much attention from researchers. Until now two main types of approaches have been proposed for this task: content- and context-based met...
Adam Jatowt
ICDAR
1995
IEEE
13 years 8 months ago
A Hough based algorithm for extracting text lines in handwritten documents
The method herein proposed detects text lines on handwritten pages which may include either lines oriented in several directions, erasures, or annotations between main lines. The m...
Laurence Likforman-Sulem, Anahid Hanimyan, Claudie...