Given a set of strings S of equal lengths over an alphabet Σ, the closest string problem seeks a string over Σ whose maximum Hamming distance to any of the given strings is as s...
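To make the definition above concrete, here is a minimal brute-force sketch of the closest string problem. It is not the method of the paper, whose approach is cut off in the excerpt; it simply enumerates every candidate string over the alphabet, so it is only practical for tiny inputs. The function names `hamming` and `closest_string` are illustrative assumptions.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_string(strings, alphabet):
    """Exhaustively search for a string minimizing the maximum Hamming
    distance to every input string. Runs in O(|alphabet|^length) time,
    so this is an illustration only, not a practical algorithm."""
    length = len(strings[0])
    if any(len(s) != length for s in strings):
        raise ValueError("all input strings must have the same length")

    best, best_radius = None, length + 1
    for candidate in product(alphabet, repeat=length):
        # The "radius" of a candidate is its worst-case distance to the set.
        radius = max(hamming(candidate, s) for s in strings)
        if radius < best_radius:
            best, best_radius = "".join(candidate), radius
    return best, best_radius


if __name__ == "__main__":
    print(closest_string(["aab", "abb", "bab"], "ab"))
```

Ties are broken by enumeration order, so the returned string is one optimal answer among possibly several.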
Histograms are used to summarize the contents of relations into a number of buckets for the estimation of query result sizes. Several techniques (e.g., MaxDiff and V-Optimal) have ...
Francesco Buccafurri, Gianluca Lax, Domenico Saccà...
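As a rough illustration of how a bucketed histogram yields a result-size estimate, the sketch below uses plain equi-width buckets and assumes values are spread uniformly inside each bucket. This is a deliberately simpler scheme than the MaxDiff and V-Optimal techniques named in the abstract, and the helper names `build_histogram` and `estimate_range_count` are hypothetical.

```python
def build_histogram(values, num_buckets):
    """Summarize values into equi-width buckets.
    Returns (lowest value, bucket width, list of per-bucket counts)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_buckets or 1.0  # avoid zero width if all values are equal
    counts = [0] * num_buckets
    for v in values:
        # Clamp so the maximum value falls into the last bucket.
        i = min(int((v - lo) / width), num_buckets - 1)
        counts[i] += 1
    return lo, width, counts


def estimate_range_count(hist, a, b):
    """Estimate how many values lie in [a, b], assuming a uniform
    distribution of values inside each bucket."""
    lo, width, counts = hist
    total = 0.0
    for i, count in enumerate(counts):
        bucket_lo = lo + i * width
        bucket_hi = bucket_lo + width
        overlap = max(0.0, min(b, bucket_hi) - max(a, bucket_lo))
        total += count * overlap / width
    return total


if __name__ == "__main__":
    hist = build_histogram(list(range(11)), 2)
    print(estimate_range_count(hist, 2.5, 7.5))
```

The estimate is only as good as the uniformity assumption; buckets with skewed contents are where more refined bucket summaries would be expected to help.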
The Jaccard/Tanimoto coefficient is an important workload, used in a large variety of problems including drug design fingerprinting, clustering analysis, similarity web searching a...
Vipin Sachdeva, Douglas M. Freimuth, Chris Mueller
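For reference, the Jaccard/Tanimoto coefficient discussed in the abstract above is the size of the intersection divided by the size of the union. The sketch below computes it for Python sets and for fingerprints stored as integer bit masks; the function names are illustrative, and returning 1.0 for two empty inputs is an assumed convention, not something stated in the source.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty sets are identical
    return len(a & b) / len(union)


def tanimoto_bits(x, y):
    """Tanimoto coefficient of two fingerprints stored as integer bit masks:
    (bits set in both) / (bits set in either)."""
    union = bin(x | y).count("1")
    if union == 0:
        return 1.0  # assumed convention, as above
    return bin(x & y).count("1") / union


if __name__ == "__main__":
    print(jaccard({1, 2, 3}, {2, 3, 4}))
    print(tanimoto_bits(0b1110, 0b0111))
```

The bit-mask form reduces the computation to AND, OR, and population counts, which is the usual formulation when fingerprints are fixed-width bit vectors.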
E-services are becoming a promising technology for the effective automation of application integration across the Web. They build upon XML as the data format for exchanging messages bet...
Given a record set D and a query score function F, a top-k query returns the k records from D whose values of F on their attributes are the highest. In this paper, we investi...
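The basic top-k definition above can be sketched in a few lines. This is only the plain query, not the problem the paper goes on to investigate, which is cut off in the excerpt. The name `top_k` and the example score function are assumptions made for illustration.

```python
import heapq


def top_k(records, score, k):
    """Return the k records with the highest score(record), best first.
    heapq.nlargest keeps a heap of size k, so this takes O(n log k) time."""
    return heapq.nlargest(k, records, key=score)


if __name__ == "__main__":
    # Each record is a tuple of attributes; F is taken to be their sum here.
    records = [(1, 5), (4, 4), (2, 9), (0, 1)]
    print(top_k(records, sum, 2))
```

Any callable that maps a record to a comparable value can play the role of F.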