Search Sciweavers | Sciweavers

563 search results - page 23 / 113

» Crawling the web for structured documents

RIAO
2007

167 views, Information Technology

From Layout to Semantic: a Reranking Model for Mapping Web Documents to Mediated XML Representations

15 years 7 months ago

Download eprints.pascal-network.org

Many documents on the Web are formatted in a weakly structured format. Because of their weak semantic and because of the heterogeneity of their formats, the information conveyed by...

Guillaume Wisniewski, Patrick Gallinari

WECWIS
2003
IEEE

132 views, ECommerce

Page Digest for Large-Scale Web Services

15 years 11 months ago

Download www.westga.edu

The rapid growth of the World Wide Web and the Internet has fueled interest in Web services and the Semantic Web, which are quickly becoming important parts of modern electronic c...

Daniel Rocco, David Buttler, Ling Liu

ICDM
2002
IEEE

162 views, Data Mining

Phrase-based Document Similarity Based on an Index Graph Model

15 years 11 months ago

Download pami.uwaterloo.ca

Document clustering techniques mostly rely on single term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...

Khaled M. Hammouda, Mohamed S. Kamel

WEBDB
2005
Springer

102 views, Database

Design and Implementation of a Geographic Search Engine

15 years 11 months ago

Download cis.poly.edu

In this paper, we describe the design and initial implementation of a geographic search engine prototype for Germany, based on a large crawl of the de domain. Geographic search en...

Alexander Markowetz, Yen-Yu Chen, Torsten Suel, Xi...

ICDE
2008
IEEE

218 views, Database

AxPRE Summaries: Exploring the (Semi-)Structure of XML Web Collections

16 years 7 months ago

Download www.cs.toronto.edu

The nature of semistructured data in web collections is evolving. Increasingly, XML web documents (or documents exchanged via web services) are valid with regard to a schema, yet ...

Mariano P. Consens, Flavio Rizzolo, Alejandro A. V...
