Sciweavers

» Cache Performance Optimizations for Parallel Lattice Boltzma...
HPCA
2009
IEEE
15 years 10 months ago
Design and implementation of software-managed caches for multicores with local memory
Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...
Sangmin Seo, Jaejin Lee, Zehra Sura
CF
2010
ACM
15 years 2 months ago
Load balancing using dynamic cache allocation
Supercomputers need a huge budget to be built and maintained. To maximize the usage of their resources, application developers spend time to optimize the code of the parallel appl...
Miquel Moretó, Francisco J. Cazorla, Rizos ...
IPPS
2003
IEEE
15 years 2 months ago
ECO: An Empirical-Based Compilation and Optimization System
In this paper, we describe a compilation system that automates much of the process of performance tuning that is currently done manually by application programmers interested in h...
Nastaran Baradaran, Jacqueline Chame, Chun Chen, P...
IPPS
2007
IEEE
15 years 3 months ago
Memory Optimizations For Fast Power-Aware Sparse Computations
We consider memory subsystem optimizations for improving the performance of sparse scientific computation while reducing the power consumed by the CPU and memory. We first co...
Konrad Malkowski, Padma Raghavan, Mary Jane Irwin
EUROPAR
2001
Springer
15 years 2 months ago
Load Redundancy Elimination on Executable Code
Optimizations performed at link time or directly applied to final program executables have received increased attention in recent years. This paper discusses the discovery and elimina...
Manel Fernández, Roger Espasa, Saumya K. De...