Background: Horizontal gene transfer (HGT) is considered a strong evolutionary force shaping the content of microbial genomes in a substantial manner. It is the difference in spee...
Stephan Waack, Oliver Keller, Roman Asper, Thomas ...
We propose efficient techniques for processing various TopK count queries on data with noisy duplicates. Our method differs from existing work on duplicate elimination in two sign...
Sunita Sarawagi, Vinay S. Deshpande, Sourabh Kasli...
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
Background: Genomic data provide a wealth of new information for phylogenetic analysis. Yet making use of this data requires phylogenetic methods that can efficiently analyze extr...
Mukul S. Bansal, J. Gordon Burleigh, Oliver Eulens...
Duplicate detection is the process of identifying multiple representations of the same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...