Search Sciweavers | Sciweavers

WWW 2005, ACM

Internet Technology

Extracting context to improve accuracy for HTML content extraction

16 years 12 days ago

Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...

Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo

WWW 2003, ACM

Internet Technology

The XML web: a first study

16 years 12 days ago

Download www.cs.toronto.edu

Although originally designed for large-scale electronic publishing, XML plays an increasingly important role in the exchange of data on the Web. In fact, it is expected that XML w...

Laurent Mignet, Denilson Barbosa, Pierangelo Veltr...

HT 2005, ACM

Internet Technology

Processing link structures and linkbases in the web's open world linking

15 years 5 months ago

Download www.pms.ifi.lmu.de

Hyperlinks are an essential feature of the World Wide Web, highly responsible for its success. XLink improves on HTML’s linking capabilities in several ways. In particular, link...

François Bry, Michael Eckert

SAC 2005, ACM

Applied Computing

Automatic extraction of informative blocks from webpages

15 years 5 months ago

Download clgiles.ist.psu.edu

Search engines crawl and index webpages depending upon their informative content. However, webpages — especially dynamically generated ones — contain items that cannot be clas...

Sandip Debnath, Prasenjit Mitra, C. Lee Giles

DOCENG 2004, ACM

Document Analysis

Page composition using PPML as a link-editing script

15 years 5 months ago

Download www.eprg.org

The advantages of a COG (Component Object Graphic) approach to the composition of PDF pages have been set out in a previous paper [1]. However, if pages are to be composed in this...

Steven R. Bagley, David F. Brailsford
