Many distributed applications can make use of large background transfers ? transfers of data that humans are not waiting for ? to improve availability, reliability, latency or con...
Estimating the cardinality (i.e. number of distinct elements) of an arbitrary set expression defined over multiple distributed streams is one of the most fundamental queries of in...
In the rank join problem, we are given a set of relations and a scoring function, and the goal is to return the join results with the top K scores. It is often the case in practic...
Abstract. Clustering is a problem of great practical importance in numerous applications. The problem of clustering becomes more challenging when the data is categorical, that is, ...
Scientific data in the life sciences is distributed over various independent multi-format databases and is constantly expanding. We discuss a scenario where a life science research...