Search Sciweavers | Sciweavers

71 search results - page 1 / 15

APWEB
2006
Springer

The Case of the Duplicate Documents: Measurement, Search, and Science

13 years 8 months ago

Download goanna.cs.rmit.edu.au

Many of the documents in large text collections are duplicates and versions of each other. In recent research, we developed new methods for finding such duplicates; however, as the...

Justin Zobel, Yaniv Bernstein

SIGIR
2004
ACM

Constructing a text corpus for inexact duplicate detection

13 years 10 months ago

Download www.conradweb.org

As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...

Jack G. Conrad, Cindy P. Schriber

WWW
2007
ACM

Efficient search engine measurements

14 years 5 months ago

Download www2007.org

We address the problem of measuring global quality metrics of search engines, like corpus size, index freshness, and density of duplicates in the corpus. The recently proposed est...

Ziv Bar-Yossef, Maxim Gurevich

WWW
2009
ACM

Measuring the similarity between implicit semantic relations from the web

14 years 5 months ago

Download www2009.org

Measuring the similarity between semantic relations that hold among entities is an important and necessary step in various Web related tasks such as relation extraction, informati...

Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuk...

KDD
2006
ACM

Understanding Content Reuse on the Web: Static and Dynamic Analyses

14 years 5 months ago

Download homepages.dcc.ufmg.br

In this paper we present static and dynamic studies of duplicate and near-duplicate documents in the Web. The static and dynamic studies involve the analysis of similar c...

Ricardo A. Baeza-Yates, Álvaro R. Pereira J...
