Sciweavers

» Web document text and images extraction using DOM analysis a...
ICDAR
2011
IEEE
12 years 5 months ago
Language-Independent Text Lines Extraction Using Seam Carving
In this paper, we present a novel language-independent algorithm for extracting text-lines from handwritten document images. Our algorithm is based on the seam carving ap...
Raid Saabni, Jihad El-Sana
ICDAR
2007
IEEE
13 years 9 months ago
WEB Image Classification Based on the Fusion of Image and Text Classifiers
This paper presents a novel method for the classification of images that combines information extracted from the images and contextual information. The main hypothesis is that con...
Pedro R. Kalva, Fabrício Enembreck, Alessan...
CIKM
2007
Springer
13 years 11 months ago
The role of documents vs. queries in extracting class attributes from text
Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs rather than document collections, as self-contained sources o...
Marius Pasca, Benjamin Van Durme, Nikesh Garera
WWW
2005
ACM
14 years 6 months ago
Extracting context to improve accuracy for HTML content extraction
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
ICDAR
2003
IEEE
13 years 10 months ago
Localization, Extraction and Recognition of Text in Telugu Document Images
In this paper we present a system to locate, extract and recognize Telugu text. The circular nature of Telugu script is exploited for segmenting text regions using the Hough Trans...
Atul Negi, K. Nikhil Shanker, Chandra Kanth Chered...