Search Sciweavers | Sciweavers

1113 search results - page 3 / 223

» Performance under Failures of DAG-based Parallel Computing


TC
2008

146 views, Information Technology

Adaptive Fault Management of Parallel Applications for High-Performance Computing

13 years 5 months ago

Download www.cs.iit.edu

As the scale of high-performance computing (HPC) continues to grow, failure resilience of parallel applications becomes crucial. In this paper, we present FT-Pro, an adaptive fault...

Zhiling Lan, Yawei Li


ICDCS
2006
IEEE

96 views, Distributed And Parallel Com...

Load Unbalancing to Improve Performance under Autocorrelated Traffic

13 years 11 months ago

Download www.cs.wm.edu

Qi Zhang, Ningfang Mi, Alma Riska, Evgenia Smirni


IPPS
2006
IEEE

179 views, Distributed And Parallel Com...

Load balancing in the presence of random node failure and recovery

13 years 11 months ago

Download www.cecs.uci.edu

In many distributed computing systems that are prone to either induced or spontaneous node failures, the number of available computing resources is dynamically changing in a rando...

Sagar Dhakal, Majeed M. Hayat, Jorge E. Pezoa, Cha...


IPPS
1999
IEEE

155 views, Distributed And Parallel Com...

Condition-Based Maintenance: Algorithms and Applications for Embedded High Performance Computing

13 years 9 months ago

Download ipdps.cc.gatech.edu

Condition-based maintenance (CBM) seeks to generate a design for a new ship-wide CBM system that performs diagnoses and failure prediction on Navy shipboard machinery. Eventually, ...

Bonnie Holte Bennett, George D. Hadden


IPPS
2005
IEEE

132 views, Distributed And Parallel Com...

Performance Implications of Periodic Checkpointing on Large-Scale Cluster Systems

13 years 11 months ago

Download adam.oliner.net

Large-scale systems like BlueGene/L are susceptible to a number of software and hardware failures that can affect system performance. Periodic application checkpointing is a commo...

Adam J. Oliner, Ramendra K. Sahoo, José E. ...

