» Jedi: Extracting and Synthesizing Information from the Web

WWW 2005, ACM

Thresher: automating the unwrapping of semantic content from the World Wide Web

16 years 2 months ago

Download www2005.org

We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...

Andrew Hogue, David R. Karger

WWW 2009, ACM

Estimating web site readability using content extraction

16 years 2 months ago

Download www2009.eprints.org

Nowadays, information is primarily searched on the WWW. From a user perspective, the readability is an important criterion for measuring the accessibility and thereby the quality ...

Thomas Gottron, Ludger Martin

LREC 2008

Automatic Identification of Temporal Information in Tourism Web Pages

15 years 3 months ago

Download www.lrec-conf.org

This paper presents our work on the detection of temporal information in web pages. The pages examined within the scope of this study were taken from the tourism sector and the te...

Stéphanie Weiser, Philippe Laublet, Jean-Lu...

SIGMOD 2008, ACM

Web-scale extraction of structured data

16 years 2 months ago

Download turing.cs.washington.edu

A long-standing goal of Web research has been to construct a unified Web knowledge base. Information extraction techniques have shown good results on Web inputs, but even most dom...

Michael J. Cafarella, Jayant Madhavan, Alon Y. Hal...

WWW 2005, ACM

Extracting context to improve accuracy for HTML content extraction

16 years 2 months ago

Download www1.cs.columbia.edu

Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...

Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo
