Data mining is increasingly performed by people who are not computer scientists or professional programmers. It is often done as an iterative process involving multiple ad-hoc tas...
Caching techniques have been used to bridge the performance gap of storage hierarchies in computing systems. In data-intensive applications that access large data files over wid...
Distributed data storage on unreliable devices connected by a short-range radio network is analyzed. Failing devices cause loss of data. To prevent the loss, the data is spli...
This paper reports our experiences on the Scalable Network Of Workstation (SNOW) project, which implements a novel methodology to support user-level process migration for traditio...
For many types of machine learning algorithms, one can compute the statistically "optimal" way to select training data. In this paper, we review how optimal data selection tec...
David A. Cohn, Zoubin Ghahramani, Michael I. Jorda...