Sciweavers

58 search results - page 2 / 12
» A global operating system for HPC clusters
CCGRID
2006
IEEE
13 years 11 months ago
Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation
With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...
Yuan Tang, Graham E. Fagg, Jack Dongarra
ICDCS
2012
IEEE
11 years 8 months ago
Combining Partial Redundancy and Checkpointing for HPC
Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (10^15 floating point operations per second) and exascale systems are projected within seven years...
James Elliott, Kishor Kharbas, David Fiala, Frank ...
SBACPAD
2008
IEEE
13 years 12 months ago
Measuring Operating System Overhead on CMT Processors
Numerous studies have shown that Operating System (OS) noise is one of the reasons for significant performance degradation in clustered architectures. Although many studies exami...
Petar Radojkovic, Vladimir Cakarevic, Javier Verd&...
CLUSTER
2007
IEEE
13 years 9 months ago
A feasibility analysis of power-awareness and energy minimization in modern interconnects for high-performance computing
High-performance computing (HPC) systems consume a significant amount of power, resulting in high operational costs, reduced reliability, and wasting of natural resources. Therefor...
Reza Zamani, Ahmad Afsahi, Ying Qian, V. Carl Hama...
SPE
2010
13 years 3 months ago
A survey of the research on power management techniques for high-performance systems
This paper surveys the research on power management techniques for high performance systems. These include both commercial high performance clusters and scientific high performanc...
Yongpeng Liu, Hong Zhu