Sciweavers

244 search results - page 41 / 49
» Basic Compiler Algorithms for Parallel Programs
ARCS
2008
Springer
15 years 1 month ago
An Optimized ZGEMM Implementation for the Cell BE
The architecture of the IBM Cell BE processor represents a new approach for designing CPUs. The fast execution of legacy software has to stand back in order to achieve very high ...
Timo Schneider, Torsten Hoefler, Simon Wunderlich,...
CASCON
1994
15 years 1 month ago
Integrating real-time and partial-order information in event-data displays
The events occurring in the execution of a distributed or parallel application are related by a partial, rather than a total, order. We have developed prototype software that coll...
David J. Taylor, Michael H. Coffin
PLDI
2005
ACM
15 years 5 months ago
Demystifying on-the-fly spill code
Modulo scheduling is an effective code generation technique that exploits the parallelism in program loops by overlapping iterations. One drawback of this optimization is that reg...
Alex Aletà, Josep M. Codina, Antonio Gonz...
PLDI
2005
ACM
15 years 5 months ago
Register allocation for software pipelined multi-dimensional loops
Software pipelining of a multi-dimensional loop is an important optimization that overlaps the execution of successive outermost loop iterations to explore instruction-level paral...
Hongbo Rong, Alban Douillet, Guang R. Gao
IEEECIT
2010
IEEE
14 years 10 months ago
Superblock-Based Source Code Optimizations for WCET Reduction
Superblocks represent regions in a program code that consist of multiple basic blocks. Compilers benefit from this structure since it enables optimization across block boundari...
Paul Lokuciejewski, Timon Kelter, Peter Marwedel