Search Sciweavers | Sciweavers

WWW
2010
ACM

188 views, Internet Technology

Exploiting content redundancy for web information extraction

15 years 3 months ago

We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...

Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...

ICDE
2007
IEEE

173 views, Database

Annotating Structured Data of the Deep Web

16 years 4 months ago

Download www.cs.binghamton.edu

An increasing number of databases have become Web accessible through HTML form-based search interfaces. The data units returned from the underlying database are usually encoded in...

Yiyao Lu, Hai He, Hongkun Zhao, Weiyi Meng, Clemen...

WWW
2008
ACM

139 views, Internet Technology

Sailer: an effective search engine for unified retrieval of heterogeneous XML and web documents

16 years 4 months ago

Download www2008.org

This paper studies the problem of unified ranked retrieval of heterogeneous XML documents and Web data. We propose an effective search engine called Sailer to adaptively and versa...

Guoliang Li, Jianhua Feng, Jianyong Wang, Xiaoming...

WWW
2003
ACM

147 views, Internet Technology

Scaling personalized web search

16 years 4 months ago

Download infolab.stanford.edu

Recent web search techniques augment traditional text matching with a global notion of "importance" based on the linkage structure of the web, such as in Google's P...

Glen Jeh, Jennifer Widom

WWW
2001
ACM

184 views, Internet Technology

Seeing the whole in parts: text summarization for web browsing on handheld devices

16 years 4 months ago

Download www10.org

We introduce five methods for summarizing parts of Web pages on handheld devices, such as personal digital assistants (PDAs), or cellular phones. Each Web page is broken into text...

Orkut Buyukkokten, Hector Garcia-Molina, Andreas P...
