Sciweavers

302 search results - page 47 / 61
» Cluster Computing: A High-Performance Contender
HIPC
2007
Springer
15 years 3 months ago
A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications
As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
John Paul Walters, Vipin Chaudhary
CLUSTER
2002
IEEE
14 years 9 months ago
Online Prediction of the Running Time of Tasks
We describe and evaluate the Running Time Advisor (RTA), a system that can predict the running time of a compute-bound task on a typical shared, unreserved commodity host...
Peter A. Dinda
CLUSTER
2004
IEEE
15 years 1 months ago
Improved message logging versus improved coordinated checkpointing for fault tolerant MPI
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H...
ISMS
2004
Springer
15 years 2 months ago
An Interactive Parallel Multigrid FEM Simulator
Physically based modeling of deformable objects such as cloth or human tissue has grown to be very important for virtual simulations. However, interactive simulation of these nonl...
Xunlei Wu, Tolga Goktekin, Frank Tendick
HPDC
2010
IEEE
14 years 10 months ago
Comparison of resource platform selection approaches for scientific workflows
Cloud computing is increasingly considered as an additional computational resource platform for scientific workflows. The cloud offers opportunity to scale-out applications from d...
Yogesh L. Simmhan, Lavanya Ramakrishnan