Sciweavers

2703 search results for "Optimizing memory transactions" (page 374 of 541)
ICCD
2004
IEEE
16 years 16 days ago
Energy Characterization of Hardware-Based Data Prefetching
This paper evaluates several hardware-based data prefetching techniques from an energy perspective, and explores their energy/performance tradeoffs. We present detailed simulation...
Yao Guo, Saurabh Chheda, Israel Koren, C. Mani Kri...
IPPS
2009
IEEE
15 years 10 months ago
Exploring the effect of block shapes on the performance of sparse kernels
In this paper we explore the impact of the block shape on blocked and vectorized versions of the Sparse Matrix-Vector Multiplication (SpMV) kernel and build upon previous work by ...
Vasileios Karakasis, Georgios I. Goumas, Nectarios...
IPPS
2009
IEEE
15 years 10 months ago
Designing multi-leader-based Allgather algorithms for multi-core clusters
The increasing demand for computational cycles is being met by the use of multi-core processors. Having a large number of cores per node necessitates multi-core aware designs to ext...
Krishna Chaitanya Kandalla, Hari Subramoni, Gopala...
EUROPAR
2009
Springer
15 years 10 months ago
PSPIKE: A Parallel Hybrid Sparse Linear System Solver
The availability of large-scale computing platforms comprised of tens of thousands of multicore processors motivates the need for the next generation of highly scalable sparse line...
Murat Manguoglu, Ahmed H. Sameh, Olaf Schenk
ICMCS
2008
IEEE
15 years 10 months ago
Fast computation of general Fourier Transforms on GPUS
We present an implementation of general FFTs for graphics processing units (GPUs). Unlike most existing GPU FFT implementations, we handle both complex and real data of any size t...
Brandon Lloyd, Chas Boyd, Naga K. Govindaraju