Search Sciweavers | Sciweavers

43 search results - page 1 / 9

» Efficient similarity joins for near duplicate detection

WWW
2008
ACM


14 years 10 months ago

Efficient similarity joins for near duplicate detection

Download www2008.org

With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...

Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...


ICDE
2009
IEEE


Top-k Set Similarity Joins

14 years 11 months ago

Download www.cse.unsw.edu.au

Abstract. Similarity join is a useful primitive operation underlying many applications, such as near duplicate Web page detection, data integration, and pattern recognition. Tradi...

Chuan Xiao, Wei Wang 0011, Xuemin Lin, Haichuan Sh...


ESWS
2010
Springer


Efficient Semantic-Aware Detection of Near Duplicate Resources

14 years 21 days ago

Download www.l3s.de

Abstract. Efficiently detecting near duplicate resources is an important task when integrating information from various sources and applications. Once detected, near duplicate reso...

Ekaterini Ioannou, Odysseas Papapetrou, Dimitrios ...


SIGIR
2008
ACM


SpotSigs: robust and efficient near duplicate detection in large web collections

13 years 9 months ago

Download ilpubs.stanford.edu

Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...

Martin Theobald, Jonathan Siddharth, Andreas Paepc...


PVLDB
2010


13 years 7 months ago

Set Similarity Join on Probabilistic Data

Download www.comp.nus.edu.sg

Set similarity join has played an important role in many real-world applications such as data cleaning, near duplication detection, data integration, and so on. In these applicati...

Xiang Lian, Lei Chen 0002

