Sciweavers

ICS
1993
Tsinghua U.
13 years 8 months ago
Dynamic Control of Performance Monitoring on Large Scale Parallel Systems
Performance monitoring of large scale parallel computers creates a dilemma: we need to collect detailed information to find performance bottlenecks, yet collecting all this data ...
Jeffrey K. Hollingsworth, Barton P. Miller
IPPS
2005
IEEE
13 years 10 months ago
Monitoring and Debugging Parallel Software with BCS-MPI on Large-Scale Clusters
Buffered CoScheduled (BCS) MPI is a novel implementation of MPI based on global synchronization of all system activities. BCS-MPI imposes a model where all processes and their com...
Juan Fernández, Fabrizio Petrini, Eitan Fra...
ICDCS
2009
IEEE
14 years 1 month ago
REMO: Resource-Aware Application State Monitoring for Large-Scale Distributed Systems
To observe, analyze and control large scale distributed systems and the applications hosted on them, there is an increasing need to continuously monitor performance attributes of ...
Shicong Meng, Srinivas R. Kashyap, Chitra Venkatra...
CCGRID
2006
IEEE
13 years 8 months ago
IPMI-based Efficient Notification Framework for Large Scale Cluster Computing
The demand for an efficient fault tolerance system has led to the development of complex monitoring infrastructure, which in turn has created an overwhelming task of data and even...
Chokchai Leangsuksun, Tirumala Rao, Anand Tikoteka...
SIGMETRICS
1996
ACM
13 years 8 months ago
Integrating Performance Monitoring and Communication in Parallel Computers
A large and increasing gap exists between processor and memory speeds in scalable cache-coherent multiprocessors. To cope with this situation, programmers and compiler writers mus...
Margaret Martonosi, David Ofelt, Mark Heinrich