Sciweavers

535 search results - page 6 / 107
» The Cache Performance and Optimizations of Blocked Algorithm...
JEA
2006
14 years 11 months ago
Cache-Friendly implementations of transitive closure
In this paper we show cache-friendly implementations of the Floyd-Warshall algorithm for the All-Pairs Shortest-Path problem. We first compare the best commercial compiler optimiza...
Michael Penner, Viktor K. Prasanna
IPPS
1999
IEEE
15 years 3 months ago
Linear Aggressive Prefetching: A Way to Increase the Performance of Cooperative Caches
Cooperative caches offer huge amounts of caching memory that is not always used as well as it could be. We might find blocks in the cache that have not been requested for many hou...
Toni Cortes, Jesús Labarta
IWNAS
2008
IEEE
15 years 6 months ago
Software Barrier Performance on Dual Quad-Core Opterons
Multi-core processor-based SMP servers have become building blocks for Linux clusters in recent years because they can deliver better performance for multi-threaded programs thro...
Jie Chen, William A. Watson III
ISCA
2007
IEEE
15 years 6 months ago
Interconnect design considerations for large NUCA caches
The ever-increasing sizes of on-chip caches and the growing domination of wire delay necessitate significant changes to cache hierarchy design methodologies. Many recent proposal...
Naveen Muralimanohar, Rajeev Balasubramonian
ICS
1999
Tsinghua U.
15 years 3 months ago
Software trace cache
This paper explores the use of compiler optimizations which optimize the layout of instructions in memory. The target is to enable the code to make better use of the underlying ...
Alex Ramírez, Josep-Lluis Larriba-Pey, Carl...