Search Sciweavers | Sciweavers

72 search results - page 9 / 15

» Ontology-Focused Crawling of Web Documents

WEBDB
2005
Springer

129 views | Database

Searching for Hidden-Web Databases

15 years 5 months ago

Download www.cs.utah.edu

Recently, there has been increased interest in the retrieval and integration of hidden Web data with a view to leverage high-quality information available in online databases. Alt...

Luciano Barbosa, Juliana Freire

IADIS
2003

91 views | Internet Technology

SPLAT: A System for Self-Plagiarism Detection

15 years 1 month ago

Download splat.cs.arizona.edu

This paper presents a system for self-plagiarism detection, SPLAT. The system uses a WebL web spider that crawls through the web sites of the top fifty Computer Science department...

Christian S. Collberg, Stephen G. Kobourov, Joshua...

ECIR
2006
Springer

134 views | Information Technology

Automatic Document Organization in a P2P Environment

15 years 1 month ago

Download ir.shef.ac.uk

This paper describes an efficient method to construct reliable machine learning applications in peer-to-peer (P2P) networks by building ensemble-based meta methods. We co...

Stefan Siersdorfer, Sergej Sizov

LAWEB
2003
IEEE

96 views | Internet Technology

On the Evolution of Clusters of Near-Duplicate Web Pages

15 years 5 months ago

Download research.microsoft.com

This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...

Dennis Fetterly, Mark Manasse, Marc Najork

CLEF
2010
Springer

159 views | Information Technology

MapReduce for Information Retrieval Evaluation: "Let's Quickly Test This on 12 TB of Data"

15 years 1 month ago

Download eprints.eemcs.utwente.nl

We propose to use MapReduce to quickly test new retrieval approaches on a cluster of machines by sequentially scanning all documents. We present a small case study in which we use ...

Djoerd Hiemstra, Claudia Hauff
