Sciweavers

CF
2007
ACM
13 years 8 months ago
An analysis of the effects of miss clustering on the cost of a cache miss
In this paper we describe a new technique, called pipeline spectroscopy, and use it to measure the cost of each cache miss. The cost of a miss is displayed (graphed) as a histogra...
Thomas R. Puzak, Allan Hartstein, Philip G. Emma, ...
ICS
1992
Tsinghua U.
13 years 8 months ago
Optimizing for parallelism and data locality
Previous research has used program transformation to introduce parallelism and to exploit data locality. Unfortunately,these twoobjectives have usuallybeen considered independentl...
Ken Kennedy, Kathryn S. McKinley
PLDI
1997
ACM
13 years 8 months ago
Data-centric Multi-level Blocking
We present a simple and novel framework for generating blocked codes for high-performance machines with a memory hierarchy. Unlike traditional compiler techniques like tiling, whi...
Induprakas Kodukula, Nawaaz Ahmed, Keshav Pingali
DAC
1997
ACM
13 years 8 months ago
Remembrance of Things Past: Locality and Memory in BDDs
Binary Decision Diagrams BDDs are e cient at manipulating large sets in a compact manner. BDDs, however, are inefcient at utilizing the memory hierarchy of the computer. Recent ...
Srilatha Manne, Dirk Grunwald, Fabio Somenzi
LCPC
1999
Springer
13 years 8 months ago
Inter-array Data Regrouping
Abstract. As the speed gap between CPU and memory widens, memory hierarchy has become the performance bottleneck for most applications because of both the high latency and low band...
Chen Ding, Ken Kennedy
ICS
1999
Tsinghua U.
13 years 8 months ago
Improving memory hierarchy performance for irregular applications
The performance of irregular applications on modern computer systems is hurt by the wide gap between CPU and memory speeds because these applications typically underutilize multi-...
John M. Mellor-Crummey, David B. Whalley, Ken Kenn...
IPPS
1999
IEEE
13 years 8 months ago
The Impact of Memory Hierarchies on Cluster Computing
Using off-the-shelf commodity workstations and PCs to build a cluster for parallel computing has become a common practice. A choice of a cost-effective cluster computing platform ...
Xing Du, Xiaodong Zhang
IPPS
2000
IEEE
13 years 9 months ago
Predicting Performance on SMPs. A Case Study: The SGI Power Challenge
We study the issue of performance prediction on the SGIPower Challenge, a typical SMP. On such a platform, the cost of memory accesses depends on their locality and on contention ...
Nancy M. Amato, Jack Perdue, Mark M. Mathis, Andre...
ICS
2001
Tsinghua U.
13 years 9 months ago
Tools for application-oriented performance tuning
Application performance tuning is a complex process that requires assembling various types of information and correlating it with source code to pinpoint the causes of performance...
John M. Mellor-Crummey, Robert J. Fowler, David B....
ICPPW
2002
IEEE
13 years 9 months ago
Near-Optimal Loop Tiling by Means of Cache Miss Equations and Genetic Algorithms
The effectiveness of the memory hierarchy is critical for the performance of current processors. The performance of the memory hierarchy can be improved by means of program transf...
Jaume Abella, Antonio González, Josep Llosa...