Space efficiency and data reliability are two primary concerns for modern storage systems. Chunk-based deduplication, which breaks up data objects into single-instance chunks that...
—Fuzzy/similarity joins have been widely studied in the research community and extensively used in real-world applications. This paper proposes and evaluates several algorithms f...
Foto N. Afrati, Anish Das Sarma, David Menestrina,...
Clustering stability is an increasingly popular family of methods for performing model selection in data clustering. The basic idea is that the chosen model should be stable under...
The actor model has already proven itself as an interesting concurrency model that avoids issues such as deadlocks and race conditions by construction, and thus facilitates concur...
Razborov and Rudich have shown that so-called natural proofs are not useful for separating P from NP unless hard pseudorandom number generators do not exist. This famous result is...