: Statistics that accurately describe the distribution of data values in the columns of relational tables are essential for effective query optimization in a database management sy...
Alexander Behm, Volker Markl, Peter J. Haas, Kesha...
In this paper we address the problem of minimizing the response time of a multi-way join query using pipelined (inter-operator) parallelism, in a parallel or a distributed environ...
A major cost in executing queries in a distributed database system is the data transfer cost incurred in transferring relations (fragments) accessed by a query from different sites...
A main problem of data integration is the treatment of conflicts caused by different modeling of real-world entities, different data models or simply by different representations ...
Scientific data in the life sciences is distributed over various independent multi-format databases and is constantly expanding. We discuss a scenario where a life science research...