Sciweavers

41 search results - page 1 / 9
PDP
2008
IEEE
13 years 11 months ago
Scheduling of QR Factorization Algorithms on SMP and Multi-Core Architectures
This paper examines the scalable parallel implementation of QR factorization of a general matrix, targeting SMP and multi-core architectures. Two implementations of algorithms-by-...
Gregorio Quintana-Ortí, Enrique S. Quintana...
DATE
2010
IEEE
13 years 9 months ago
Recursion-driven parallel code generation for multi-core platforms
We present Huckleberry, a tool for automatically generating parallel implementations for multi-core platforms from sequential recursive divide-and-conquer programs. The recursiv...
Rebecca L. Collins, Bharadwaj Vellore, Luca P. Car...
HIPC
2009
Springer
13 years 2 months ago
A performance prediction model for the CUDA GPGPU platform
The significant growth in computational power of modern Graphics Processing Units (GPUs) coupled with the advent of general purpose programming environments like NVIDIA's CUDA,...
Kishore Kothapalli, Rishabh Mukherjee, M. Suhail R...
IPPS
2008
IEEE
13 years 11 months ago
Lattice Boltzmann simulation optimization on leading multicore platforms
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
FCCM
2011
IEEE
12 years 8 months ago
Synthesis of Platform Architectures from OpenCL Programs
The problem of automatically generating hardware modules from a high-level representation of an application has been at the research forefront in the last few years. In this pap...
Muhsen Owaida, Nikolaos Bellas, Konstantis Dalouka...