Abstract—For a complex distributed system to be dependable, it must be continuously monitored, so that its failures and imperfections can be discovered and corrected in a timely ...
Constantin Serban, Wenxuan Zhang, Naftaly H. Minsk...
Fast networks have made it possible to coordinate distributed heterogeneous CPU, memory, and storage resources to provide a powerful platform for executing high-performance applic...
In recent years, several frameworks have been developed for processing very large quantities of data on large clusters of commodity PCs. These frameworks have focused on fault-tole...
We consider storage in an extremely large-scale distributed computer system designed for stream processing applications. In such systems, incoming data and intermediate results ma...
Kirsten Hildrum, Fred Douglis, Joel L. Wolf, Phili...
Two ideas, drawn from Bayesian optimization and classifier systems, are presented for personnel scheduling based on choosing a suitable scheduling rule from a set for each person’s...