Search Sciweavers | Sciweavers

502 search results - page 11 / 101

» Extracting Partial Structures from HTML Documents

WWW
2009
ACM

213 views, Internet Technology

Extracting article text from the web with maximum subsequence segmentation

16 years 6 months ago

Download www2009.org

Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...

Jeff Pasternack, Dan Roth

DOCENG
2009
ACM

139 views, Document Analysis

Web document text and images extraction using DOM analysis and natural language processing

16 years 16 days ago

Download www.hpl.hp.com

Web Document Text and Images Extraction using DOM Analysis and Natural Language Processing. Parag Mulendra Joshi, Sam Liu. HP Laboratories, HPL-2009-187. Web page text extraction,...

Parag Mulendra Joshi, Sam Liu

EACL
2006
ACL Anthology

91 views, Natural Language Processing

Multilingual Term Extraction from Domain-specific Corpora Using Morphological Structure

15 years 7 months ago

Download acl.ldc.upenn.edu

Morphologically complex terms composed from Greek or Latin elements are frequent in scientific and technical texts. Word forming units are thus relevant cues for the identificatio...

Delphine Bernhard

WEBDB
2009
Springer

149 views, Database

Extracting Route Directions from Web Pages

16 years 18 days ago

Download webdb09.cse.buffalo.edu

Linguists and geographers are more and more interested in route direction documents because they contain interesting motion descriptions and language patterns. A large number of s...

Xiao Zhang, Prasenjit Mitra, Sen Xu, Anuj R. Jaisw...

ICDAR
2003
IEEE

169 views, Document Analysis

Document Transformation System from Papers to XML Data Based on Pivot XML Document Method

15 years 11 months ago

Download www.cse.salford.ac.uk

This paper proposes a new method for document transformation using OCR to generate various XML documents from printed documents. The proposed method adopts a hierarchical transfor...

Yasuto Ishitani
