Sciweavers

59 search results - page 2 / 12
» Web article extraction for web printing: a DOM visual based ...
Sort
View
WWW
2003
ACM
14 years 5 months ago
DOM-based content extraction of HTML documents
Web pages often contain clutter (such as pop-up ads, unnecessary images and extraneous links) around the body of an article that distracts a user from actual content. Extraction o...
Suhit Gupta, Gail E. Kaiser, David Neistadt, Peter...
AAAI
2006
13 years 6 months ago
Table Extraction Using Spatial Reasoning on the CSS2 Visual Box Model
Tables on web pages contain a huge amount of semantically explicit information, which makes them a worthwhile target for automatic information extraction and knowledge acquisition...
Wolfgang Gatterbauer, Paul Bohunsky
WWW
2005
ACM
14 years 5 months ago
Extracting context to improve accuracy for HTML content extraction
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
SAC
2004
ACM
13 years 10 months ago
Classifying biological articles using web resources
Text classification systems on biomedical literature aim to select relevant articles to a specific issue from large corpora. Most systems with an acceptable accuracy are based o...
Francisco M. Couto, Bruno Martins, Mário J....
APWEB
2003
Springer
13 years 10 months ago
Extracting Content Structure for Web Pages Based on Visual Representation
Abstract. A new web content structure based on visual representation is proposed in this paper. Many web applications such as information retrieval, information extraction and auto...
Deng Cai, Shipeng Yu, Ji-Rong Wen, Wei-Ying Ma