Sciweavers

268 search results - page 43 / 54
» Analyzing Parallel Programs with Pin
HPDC
2007
IEEE
15 years 4 months ago
Feedback-directed thread scheduling with memory considerations
This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentatio...
Fengguang Song, Shirley Moore, Jack Dongarra
ICPP
2003
IEEE
15 years 2 months ago
Scalable Implementations of MPI Atomicity for Concurrent Overlapping I/O
For concurrent I/O operations, atomicity defines the results in the overlapping file regions simultaneously read/written by requesting processes. Atomicity has been well studied...
Wei-keng Liao, Alok N. Choudhary, Kenin Coloma, Ge...
ICS
2001
Tsinghua U.
15 years 2 months ago
Tools for application-oriented performance tuning
Application performance tuning is a complex process that requires assembling various types of information and correlating it with source code to pinpoint the causes of performance...
John M. Mellor-Crummey, Robert J. Fowler, David B....
SC
2000
ACM
15 years 2 months ago
Performance Modeling and Tuning of an Unstructured Mesh CFD Application
This paper describes performance tuning experiences with a three-dimensional unstructured grid Euler flow code from NASA, which we have reimplemented in the PETSc framework and p...
William Gropp, Dinesh K. Kaushik, David E. Keyes, ...
IPPS
2010
IEEE
14 years 7 months ago
Dynamic analysis of the relay cache-coherence protocol for distributed transactional memory
Transactional memory is an alternative programming model for managing contention in accessing shared in-memory data objects. Distributed transactional memory (TM) promises to alle...
Bo Zhang, Binoy Ravindran