Sciweavers

282 search results - page 24 / 57
» Reliability and Scheduling on Systems Subject to Failures
JAVA
2001
Springer
15 years 2 months ago
A scalable, robust network for parallel computing
CX, a network-based computational exchange, is presented. The system’s design integrates variations of ideas from other researchers, such as work stealing, non-blocking tasks, e...
Peter R. Cappello, Dimitrios Mourloukos
NOSSDAV
2005
Springer
15 years 3 months ago
1-800-OVERLAYS: using overlay networks to improve VoIP quality
The cost savings and novel features associated with Voice over IP (VoIP) are driving its adoption by service providers. Such a transition, however, can successfully happen only if t...
Yair Amir, Claudiu Danilov, Stuart Goose, David He...
SOSP
2007
ACM
15 years 6 months ago
Dynamo: Amazon's highly available key-value store
Reliability at massive scale is one of the biggest challenges we face at Amazon.com, one of the largest e-commerce operations in the world; even the slightest outage has significa...
Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, ...
CLOUD
2010
ACM
15 years 3 months ago
Making cloud intermediate data fault-tolerant
Parallel dataflow programs generate enormous amounts of distributed data that are short-lived, yet are critical for completion of the job and for good run-time performance. We ca...
Steven Y. Ko, Imranul Hoque, Brian Cho, Indranil G...
EUMAS
2006
14 years 11 months ago
DimaX: A Fault-Tolerant Multi-Agent Platform
Fault tolerance is an important property of large-scale multiagent systems as the failure rate grows with both the number of the hosts and deployed agents, and the duration of com...
Nora Faci, Zahia Guessoum, Olivier Marin