In this work we tackle the open problem of self-join size (SJS) estimation in a large-scale Distributed Data System, where tuples of a relation are distributed over data nodes whic...
We introduce a network-based problem detection framework for distributed systems, which includes a data-mining method for discovering dynamic dependencies among distributed servic...
RDF is a data model for representing labeled directed graphs, and it is used as an important building block of semantic web. Due to its flexibility and applicability, RDF has bee...
Hyunsik Choi, Jihoon Son, YongHyun Cho, Min Kyoung...
We present a replication-based approach to fault-tolerant distributed stream processing in the face of node failures, network failures, and network partitions. Our approach aims t...
Magdalena Balazinska, Hari Balakrishnan, Samuel Ma...
Motivated by the increasing need to analyze complex, uncertain multidimensional data this paper proposes probabilistic OLAP queries that are computed using probability distributio...
Igor Timko, Curtis E. Dyreson, Torben Bach Pederse...