Sciweavers

TOG
2012
11 years 7 months ago
Decoupling algorithms from schedules for easy optimization of image processing pipelines
Using existing programming tools, writing high-performance image processing code requires sacrificing readability, portability, and modularity. We argue that this is a consequenc...
Jonathan Ragan-Kelley, Andrew Adams, Sylvain Paris...
ARC
2012
Springer
12 years 9 days ago
Scalable Memory Hierarchies for Embedded Manycore Systems
As the size of FPGA devices grows following Moore’s law, it becomes possible to put a complete manycore system onto a single FPGA chip. The centralized memory hierarchy on typica...
Sen Ma, Miaoqing Huang, Eugene Cartwright, David L...
PPOPP
2011
ACM
12 years 7 months ago
Programming the memory hierarchy revisited: supporting irregular parallelism in Sequoia
We describe two novel constructs for programming parallel machines with multi-level memory hierarchies: call-up, which allows a child task to invoke computation on its parent, and...
Michael Bauer, John Clark, Eric Schkufza, Alex Aik...
ISCA
2011
IEEE
12 years 8 months ago
Moguls: a model to explore the memory hierarchy for bandwidth improvements
In recent years, the increasing number of processor cores and limited increases in main memory bandwidth have led to the problem of the bandwidth wall, where memory bandwidth is b...
Guangyu Sun, Christopher J. Hughes, Changkyu Kim, ...
EUROPAR
2010
Springer
13 years 5 months ago
Maestro: Data Orchestration and Tuning for OpenCL Devices
As heterogeneous computing platforms become more prevalent, the programmer must account for complex memory hierarchies in addition to the difficulties of parallel program...
Kyle Spafford, Jeremy S. Meredith, Jeffrey S. Vett...
FOCS
1998
IEEE
13 years 8 months ago
Towards an Optimal Bit-Reversal Permutation Program
The speed of many computations is limited not by the number of arithmetic operations but by the time it takes to move and rearrange data in the increasingly complicated memory hie...
Larry Carter, Kang Su Gatlin
ICS
2007
Tsinghua U.
13 years 10 months ago
Adaptive Strassen's matrix multiplication
Strassen’s matrix multiplication (MM) has benefits with respect to any (highly tuned) implementations of MM because Strassen’s reduces the total number of operations. Strasse...
Paolo D'Alberto, Alexandru Nicolau