Sciweavers

CLUSTER
2004
IEEE
13 years 8 months ago
Improved message logging versus improved coordinated checkpointing for fault tolerant MPI
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H...
CLUSTER
2004
IEEE
13 years 8 months ago
RFS: efficient and flexible remote file access for MPI-IO
Scientific applications often need to access remote file systems. Because of slow networks and large data size, however, remote I/O can become an even more serious performance bot...
Jonghyun Lee, Robert B. Ross, Rajeev Thakur, Xiaos...
CLUSTER
2004
IEEE
13 years 8 months ago
A distributed data management middleware for data-driven application systems
A key challenge in supporting data-driven scientific applications is the storage and management of input and output data in a distributed environment. In this paper, we describe a...
Stephen Langella, Shannon Hastings, Scott Oster, T...
CLUSTER
2004
IEEE
13 years 8 months ago
JuxtaView - a tool for interactive visualization of large imagery on scalable tiled displays
JuxtaView is a cluster-based application for viewing ultra-high-resolution images on scalable tiled displays. In JuxtaView, we present a new parallel computing and distributed mem...
Naveen K. Krishnaprasad, Venkatram Vishwanath, Sha...
CLUSTER
2004
IEEE
13 years 8 months ago
A client-centric grid knowledgebase
Grid computing brings with it additional complexities and unexpected failures. Just keeping track of our jobs traversing different grid resources before completion can at times be...
George Kola, Tevfik Kosar, Miron Livny
CLUSTER
2004
IEEE
13 years 8 months ago
Cluster computing environment supporting single system image
Single system image (SSI) systems have been the mainstay of high-performance computing for many years. SSI requires the integration and aggregation of all types of resources in a c...
Min Choi, DaeWoo Lee, Seung Ryoul Maeng
CLUSTER
2004
IEEE
13 years 8 months ago
Bandwidth-aware co-allocating meta-schedulers for mini-grid architectures
The interaction of simultaneously co-allocated jobs can often create contention in the network infrastructure of a dedicated computational grid. This contention can lead to degrad...
William M. Jones, Louis W. Pang, Walter B. Ligon I...
CLUSTER
2004
IEEE
13 years 8 months ago
On optimizing collective communication
In this paper we discuss issues related to the high-performance implementation of collective communications operations on distributed-memory computer architectures. Using a combina...
E. W. Chan, M. F. Heimlich, Avi Purkayastha, Rober...