Sciweavers

2811 search results - page 328 / 563
» Virtue: Performance Visualization of Parallel and Distribute...
Sort
View
131
Voted
IPPS
2008
IEEE
15 years 11 months ago
Lattice Boltzmann simulation optimization on leading multicore platforms
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of searchbased performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
215
Voted
HPDC
2008
IEEE
15 years 11 months ago
Harmony: an execution model and runtime for heterogeneous many core systems
The emergence of heterogeneous many core architectures presents a unique opportunity for delivering order of magnitude performance increases to high performance applications by ma...
Gregory F. Diamos, Sudhakar Yalamanchili
IPPS
2002
IEEE
15 years 9 months ago
Hierarchical Interconnects for On-Chip Clustering
In the sub-micron technology era, wire delays are becoming much more important than gate delays, making it particularly attractive to go for clustered designs. A common form of cl...
Aneesh Aggarwal, Manoj Franklin
ICPPW
2005
IEEE
15 years 10 months ago
A Practical Approach to the Rating of Barrier Algorithms Using the LogP Model and Open MPI
Large–scale parallel applications performing global synchronization may spend a significant amount of execution time waiting for the completion of a barrier operation. Conseque...
Torsten Hoefler, Lavinio Cerquetti, Torsten Mehlan...
HPDC
2005
IEEE
15 years 10 months ago
Collective caching: application-aware client-side file caching
Parallel file subsystems in today’s high-performance computers adopt many I/O optimization strategies that were designed for distributed systems. These strategies, for instance...
Wei-keng Liao, Kenin Coloma, Alok N. Choudhary, Le...