Sciweavers

4495 search results - page 1 / 899
» A Performance Monitoring System for Large Computing Clusters
CCGRID
2006
IEEE
13 years 8 months ago
IPMI-based Efficient Notification Framework for Large Scale Cluster Computing
The demand for an efficient fault tolerance system has led to the development of complex monitoring infrastructure, which in turn has created an overwhelming task of data and even...
Chokchai Leangsuksun, Tirumala Rao, Anand Tikoteka...
IPPS
2005
IEEE
13 years 10 months ago
Monitoring and Debugging Parallel Software with BCS-MPI on Large-Scale Clusters
Buffered CoScheduled (BCS) MPI is a novel implementation of MPI based on global synchronization of all system activities. BCS-MPI imposes a model where all processes and their com...
Juan Fernández, Fabrizio Petrini, Eitan Fra...
ICS
1993
Tsinghua U.
13 years 8 months ago
Dynamic Control of Performance Monitoring on Large Scale Parallel Systems
Performance monitoring of large scale parallel computers creates a dilemma: we need to collect detailed information to find performance bottlenecks, yet collecting all this data...
Jeffrey K. Hollingsworth, Barton P. Miller