Sciweavers

VLDB
2002
ACM
13 years 4 months ago
Eliminating Fuzzy Duplicates in Data Warehouses
The duplicate elimination problem of detecting multiple tuples, which describe the same real world entity, is an important data cleaning problem. Previous domain independent solut...
Rohit Ananthakrishna, Surajit Chaudhuri, Venkatesh...
FAST
2010
13 years 7 months ago
Bimodal Content Defined Chunking for Backup Streams
Data deduplication has become a popular technology for reducing the amount of storage space necessary for backup and archival data. Content defined chunking (CDC) techniques are w...
Erik Kruus, Cristian Ungureanu, Cezary Dubnicki
WEBDB
2004
Springer
13 years 10 months ago
Unraveling the Duplicate-Elimination Problem in XML-to-SQL Query Translation
We consider the scenario where existing relational data is exported as XML. In this context, we look at the problem of translating XML queries into SQL. XML query languages have t...
Rajasekar Krishnamurthy, Raghav Kaushik, Jeffrey F...
SSDBM
2007
IEEE
13 years 11 months ago
Duplicate Elimination in Space-partitioning Tree Indexes
Space-partitioning trees, like the disk-based trie, quadtree, kd-tree and their variants, are a family of access methods that index multi-dimensional objects. In the case of index...
Mohamed Y. Eltabakh, Mourad Ouzzani, Walid G. Aref