Sciweavers

45 search results - page 6 / 9
» Cache Performance Optimizations for Parallel Lattice Boltzma...
Sort
View
ICPP
1999
IEEE
15 years 1 months ago
Improving Performance of Load-Store Sequences for Transaction Processing Workloads on Multiprocessors
On-line transaction processing exhibits poor memory behavior in high-end multiprocessor servers because of complex sharing patterns and substantial interaction between the databas...
Jim Nilsson, Fredrik Dahlgren
IEEEPACT
2009
IEEE
14 years 7 months ago
Region Based Structure Layout Optimization by Selective Data Copying
As the gap between processor and memory continues to grow, memory performance becomes a key performance bottleneck for many applications. Compilers therefore increasingly seek to m...
Sandya S. Mannarswamy, Ramaswamy Govindarajan, Ris...
IPPS
2010
IEEE
14 years 7 months ago
Servet: A benchmark suite for autotuning on multicore clusters
Abstract--The growing complexity in computer system hierarchies due to the increase in the number of cores per processor, levels of cache (some of them shared) and the number of pr...
Jorge González-Domínguez, Guillermo ...
104
Voted
IPPS
1998
IEEE
15 years 1 months ago
Memory Hierarchy Management for Iterative Graph Structures
The increasing gap in processor and memory speeds has forced microprocessors to rely on deep cache hierarchies to keep the processors from starving for data. For many applications...
Ibraheem Al-Furaih, Sanjay Ranka
EUROPAR
2003
Springer
15 years 2 months ago
Obtaining Hardware Performance Metrics for the BlueGene/L Supercomputer
Hardware performance monitoring is the basis of modern performance analysis tools for application optimization. We are interested in providing such performance analysis tools for t...
Pedro Mindlin, José R. Brunheroto, Luiz De ...