Sampling is a widely used technique to increase efficiency in database and data mining applications operating on large datasets. In this paper, we present a scalable sampling imple...
While the IETF standardization process of the Mobile IPv6 and Network Mobility (NEMO) protocols is almost complete, their large-scale deployment is not yet possible. With these te...
The development of scalable parallel database systems requires the design of efficient algorithms for the join operation, which is the most frequent and expensive operatio...
Clustering is a data mining problem which finds dense regions in a sparse multi-dimensional data set. The attribute values and ranges of these regions characterize the clusters. ...
Many emerging large-scale data science applications require searching large graphs distributed across multiple memories and processors. This paper presents a distributed breadth...
Andy Yoo, Edmond Chow, Keith W. Henderson, Will Mc...