Sciweavers

1166 search results - page 92 / 234
» Crash Management for Distributed Parallel Systems
Sort
View
PPOPP
2006
ACM
16 years 1 days ago
Predicting bounds on queuing delay for batch-scheduled parallel machines
Most space-sharing parallel computers presently operated by high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users wishi...
John Brevik, Daniel Nurmi, Richard Wolski
IPPS
2010
IEEE
15 years 4 months ago
Initial characterization of parallel NFS implementations
Parallel NFS (pNFS) is touted as an emergent standard protocol for parallel I/O access in various storage environments. Several pNFS prototypes have been implemented for initial v...
Weikuan Yu, Jeffrey S. Vetter
CCGRID
2005
IEEE
15 years 11 months ago
A distributed resource and network partitioning architecture for service grids
Abstract In this paper, we propose the use of a distributed service management architecture for state-of-the-art service-enabled Grids. The architecture is capable of performing au...
Bruno Volckaert, Pieter Thysebaert, Marc De Leenhe...
HPCC
2005
Springer
15 years 11 months ago
Factory: An Object-Oriented Parallel Programming Substrate for Deep Multiprocessors
Abstract. Recent advances in processor technology such as Simultaneous Multithreading (SMT) and Chip Multiprocessing (CMP) enable parallel processing on a single die. These process...
Scott Schneider, Christos D. Antonopoulos, Dimitri...
ICMAS
2000
15 years 7 months ago
The Adaptive Agent Architecture: Achieving Fault-Tolerance Using Persistent Broker Teams
Brokers are used in many multi-agent systems for locating agents, for routing and sharing information, for managing the system, and for legal purposes, as independent third partie...
Sanjeev Kumar, Philip R. Cohen, Hector J. Levesque