CF
2006
ACM
13 years 6 months ago
Intermediately executed code is the key to find refactorings that improve temporal data locality
The growing speed gap between memory and processor makes an efficient use of the cache ever more important to reach high performance. One of the most important ways to improve cac...
Kristof Beyls, Erik H. D'Hollander
CF
2006
ACM
13 years 8 months ago
Performance characteristics of an adaptive mesh refinement calculation on scalar and vector platforms
Adaptive mesh refinement (AMR) is a powerful technique that reduces the resources necessary to solve otherwise intractable problems in computational science. The AMR strategy solv...
Michael L. Welcome, Charles A. Rendleman, Leonid O...
CF
2006
ACM
13 years 8 months ago
The potential of the Cell processor for scientific computing
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. As...
Samuel Williams, John Shalf, Leonid Oliker, Shoaib...
CF
2006
ACM
13 years 8 months ago
An efficient cache design for scalable glueless shared-memory multiprocessors
Traditionally, cache coherence in large-scale shared-memory multiprocessors has been ensured by means of a distributed directory structure stored in main memory. In this way, the ...
Alberto Ros, Manuel E. Acacio, José M. Garc...
CF
2006
ACM
13 years 8 months ago
Landing OpenMP on Cyclops-64: an efficient mapping of OpenMP to a many-core system-on-a-chip
This paper presents our experience mapping the OpenMP parallel programming model to the IBM Cyclops-64 (C64) architecture. The C64 employs a many-core-on-a-chip design that integrates...
Juan del Cuvillo, Weirong Zhu, Guang R. Gao
CF
2006
ACM
13 years 10 months ago
Energy-aware data prefetching for multi-speed disks
Seung Woo Son, Mahmut T. Kandemir
CF
2006
ACM
13 years 10 months ago
Improving the memory behavior of vertical filtering in the discrete wavelet transform
The discrete wavelet transform (DWT) is used in several image and video compression standards, in particular JPEG2000. A 2D DWT consists of horizontal filtering along the rows fo...
Asadollah Shahbahrami, Ben H. H. Juurlink, Stamati...
CF
2006
ACM
13 years 10 months ago
Kilo-instruction processors, runahead and prefetching
There is a continuous research effort devoted to overcoming the memory wall problem. Prefetching is one of the most frequently used techniques. A prefetch mechanism anticipates the ...
Tanausú Ramírez, Alex Pajuelo, Olive...