Search Sciweavers | Sciweavers

» Crawling the web for structured documents

NSDI
2010

194 views, Computer Networks

The Architecture and Implementation of an Extensible Web Crawler

15 years 7 months ago

Download www.usenix.org

Many Web services operate their own Web crawlers to discover data of interest, despite the fact that large-scale, timely crawling is complex, operationally intensive, and expensive...

Jonathan M. Hsieh, Steven D. Gribble, Henry M. Lev...

WWW
2009
ACM

135 views, Internet Technology

User-centric content freshness metrics for search engines

16 years 6 months ago

Download www2009.org

In order to return relevant search results, a search engine must keep its local repository synchronized to the Web, but it is usually impossible to attain perfect freshness. Hence...

Ali Dasdan, Xinh Huynh

ADAPTIVE
2007
Springer

240 views, Internet Technology

Web Document Modeling

15 years 11 months ago

Download www.dcs.warwick.ac.uk

A very common issue of adaptive Web-based systems is the modeling of documents. Such documents represent domain-specific information for a number of purposes. Application areas su...

Alessandro Micarelli, Filippo Sciarrone, Mauro Mar...

IJIPM
2010

84 views

Detection of Duplication in Documents and WebPages Based Documents Syntactical Structures through an Improved Longest Common Sub

15 years 2 months ago

Download www.humanpub.org

Mohamed Elhadi, Amjad Al-Tobi

SIGIR
2012
ACM

242 views, Information Technology

Optimizing positional index structures for versioned document collections

13 years 8 months ago

Download cis.poly.edu

Versioned document collections are collections that contain multiple versions of each document. Important examples are Web archives, Wikipedia and other wikis, or source code and ...

Jinru He, Torsten Suel
