Sciweavers

» Crash Management for Distributed Parallel Systems
PPOPP
2006
ACM
15 years 7 months ago
Predicting bounds on queuing delay for batch-scheduled parallel machines
Most space-sharing parallel computers presently operated by high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users wishi...
John Brevik, Daniel Nurmi, Richard Wolski
IPPS
2010
IEEE
14 years 11 months ago
Initial characterization of parallel NFS implementations
Parallel NFS (pNFS) is touted as an emergent standard protocol for parallel I/O access in various storage environments. Several pNFS prototypes have been implemented for initial v...
Weikuan Yu, Jeffrey S. Vetter
CCGRID
2005
IEEE
15 years 7 months ago
A distributed resource and network partitioning architecture for service grids
In this paper, we propose the use of a distributed service management architecture for state-of-the-art service-enabled Grids. The architecture is capable of performing au...
Bruno Volckaert, Pieter Thysebaert, Marc De Leenhe...
HPCC
2005
Springer
15 years 7 months ago
Factory: An Object-Oriented Parallel Programming Substrate for Deep Multiprocessors
Recent advances in processor technology such as Simultaneous Multithreading (SMT) and Chip Multiprocessing (CMP) enable parallel processing on a single die. These process...
Scott Schneider, Christos D. Antonopoulos, Dimitri...
ICMAS
2000
15 years 2 months ago
The Adaptive Agent Architecture: Achieving Fault-Tolerance Using Persistent Broker Teams
Brokers are used in many multi-agent systems for locating agents, for routing and sharing information, for managing the system, and for legal purposes, as independent third partie...
Sanjeev Kumar, Philip R. Cohen, Hector J. Levesque