Sciweavers

» Performance and effectiveness trade-off for checkpointing in...
DSN
2004
IEEE
13 years 9 months ago
Optimal Object State Transfer - Recovery Policies for Fault Tolerant Distributed Systems
Recent developments in the field of object-based fault tolerance and the advent of the first OMG FTCORBA compliant middleware raise new requirements for the design process of dist...
Panagiotis Katsaros, Constantine Lazos
DSN
2009
IEEE
14 years 2 days ago
A QoS-aware fault tolerant middleware for dependable service composition
Based on the framework of service-oriented architecture (SOA), complex distributed systems can be dynamically and automatically composed by integrating distributed Web services pr...
Zibin Zheng, Michael R. Lyu
HASE
1996
IEEE
13 years 9 months ago
Adaptive recovery for mobile environments
Mobile computing allows ubiquitous and continuous access to computing resources while the users travel or work at a client's site. The flexibility introduced by mobile computi...
Nuno Neves, W. Kent Fuchs
HPDC
2012
IEEE
11 years 7 months ago
Understanding the effects and implications of compute node related failures in Hadoop
Hadoop has become a critical component in today’s cloud environment. Ensuring good performance for Hadoop is paramount for the wide range of applications built on top of it. In ...
Florin Dinu, T. S. Eugene Ng
HIPC
2007
Springer
13 years 11 months ago
A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications
As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
John Paul Walters, Vipin Chaudhary