Sciweavers

43 search results - page 5 / 9
» Efficient similarity joins for near duplicate detection
ICMCS
2006
IEEE
15 years 3 months ago
Large-Scale Duplicate Detection for Web Image Search
Finding visually identical images in large image collections is important for many applications, such as intellectual property protection and search result presentation. Several a...
Bin Wang, Zhiwei Li, Mingjing Li, Wei-Ying Ma
SIGIR
2010
ACM
14 years 4 months ago
Efficient partial-duplicate detection based on sequence matching
With the ever-increasing growth of the Internet, numerous copies of documents have become a serious problem for search engines, opinion mining, and many other web applications. Since parti...
Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang
DEXA
2004
Springer
15 years 3 months ago
PC-Filter: A Robust Filtering Technique for Duplicate Record Detection in Large Databases
In this paper, we propose PC-Filter (PC stands for Partition Comparison), a robust data filter for approximately duplicate record detection in large databases. PC-Filter dis...
Ji Zhang, Tok Wang Ling, Robert M. Bruckner, Han L...
ICAIL
2007
ACM
15 years 1 month ago
Essential deduplication functions for transactional databases in law firms
As massive document repositories and knowledge management systems continue to expand, in proprietary environments as well as on the Web, the need for duplicate detection becomes i...
Jack G. Conrad, Edward L. Raymond
INFOCOM
2010
IEEE
14 years 7 months ago
Efficient Similarity Estimation for Systems Exploiting Data Redundancy
Many modern systems exploit data redundancy to improve efficiency. These systems split data into chunks, generate identifiers for each of them, and compare the identifiers among ot...
Kanat Tangwongsan, Himabindu Pucha, David G. Ander...
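
The chunk-and-compare idea described in the abstract above (split data into chunks, generate an identifier per chunk, compare identifiers) can be illustrated with a minimal sketch. The abstract is truncated, so this is not the paper's method: the fixed chunk size, the use of SHA-1 as the identifier, the Jaccard measure, and the names `chunk_ids` and `jaccard` are all assumptions chosen for illustration.

```python
import hashlib


def chunk_ids(data: bytes, chunk_size: int = 4) -> set[str]:
    """Split data into fixed-size chunks and return the set of chunk identifiers.

    Assumption: fixed-size chunking and SHA-1 identifiers, picked for simplicity.
    """
    return {
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }


def jaccard(a: set[str], b: set[str]) -> float:
    """Fraction of identifiers shared by the two sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two hypothetical inputs that share three of their four chunks.
ids_a = chunk_ids(b"aaaabbbbccccdddd")
ids_b = chunk_ids(b"aaaabbbbcccceeee")
similarity = jaccard(ids_a, ids_b)
```

Comparing full identifier sets like this costs time and space proportional to the number of chunks; it is only meant to make the redundancy comparison concrete.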