Data stream applications have made use of statistical summaries to reason about the data using nonparametric tools such as histograms, heavy hitters, and join sizes. However, rela...
This paper describes the Network-Attached Secure Disk (NASD) storage architecture, prototype implementations of NASD drives, array management for our architecture, and three files...
Garth A. Gibson, David Nagle, Khalil Amiri, Jeff B...
Background: The goal of information integration in systems biology is to combine information from a number of databases and data sets, which are obtained from both high and low th...
Michael Baitaluk, Xufei Qian, Shubhada Godbole, Al...
Analyzing and controlling large distributed services under a wide range of conditions is difficult. Yet these capabilities are essential to a number of important development and o...
David L. Oppenheimer, Vitaliy Vatkovskiy, Hakim We...
In this paper we introduce the Generalized Bayesian Committee Machine (GBCM) for applications with large data sets. In particular, the GBCM can be used in the context of kernel ba...