Sciweavers

70 search results - page 2 / 14
» Experimental Evaluation of the DECOS Fault-Tolerant Communic...
ICDCS
2012
IEEE
11 years 7 months ago
Combining Partial Redundancy and Checkpointing for HPC
Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (10^15 floating point operations per second) and exascale systems are projected within seven years...
James Elliott, Kishor Kharbas, David Fiala, Frank ...
OTM
2009
Springer
13 years 12 months ago
Evaluating Throughput Stability of Protocols for Distributed Middleware
Communication of large data volumes is a core functionality of distributed systems middleware, namely, for interconnecting components, for distributed computation and for fault tol...
Nuno Carvalho, José P. Oliveira, José...
SIGCOMM
2009
ACM
13 years 11 months ago
PortLand: a scalable fault-tolerant layer 2 data center network fabric
This paper considers the requirements for a scalable, easily manageable, fault-tolerant, and efficient data center network fabric. Trends in multi-core processors, end-host virtua...
Radhika Niranjan Mysore, Andreas Pamboris, Nathan ...
INFOCOM
2011
IEEE
12 years 8 months ago
Experimental evaluation of optimal CSMA
By ‘optimal CSMA’ we denote a promising approach to maximize throughput-based utility in wireless networks without message passing or synchronization among nodes. De...
Bruno Nardelli, Jinsung Lee, Kangwook Lee, Yung Yi...
EDCC
2008
Springer
13 years 7 months ago
A Distributed Approach to Autonomous Fault Treatment in Spread
This paper presents the design and implementation of the Distributed Autonomous Replication Management (DARM) framework built on top of the Spread group communication system. The ...
Hein Meling, Joakim L. Gilje