Sciweavers

221 search results - page 40 / 45
» Scheduling instruction effects for a statically pipelined pr...
Sort
View
ASPLOS
2004
ACM
15 years 3 months ago
Compiler orchestrated prefetching via speculation and predication
This paper introduces a compiler-orchestrated prefetching system as a unified framework geared toward ameliorating the gap between processing speeds and memory access latencies. ...
Rodric M. Rabbah, Hariharan Sandanagobalane, Mongk...
LCTRTS
2005
Springer
15 years 3 months ago
Cache aware optimization of stream programs
Effective use of the memory hierarchy is critical for achieving high performance on embedded systems. We focus on the class of streaming applications, which is increasingly preval...
Janis Sermulins, William Thies, Rodric M. Rabbah, ...
CASES
2006
ACM
15 years 2 months ago
Reaching fast code faster: using modeling for efficient software thread integration on a VLIW DSP
When integrating software threads together to boost performance on a processor with instruction-level parallel processing support, it is rarely clear which code regions should be ...
Won So, Alexander G. Dean
PLDI
2005
ACM
15 years 3 months ago
Demystifying on-the-fly spill code
Modulo scheduling is an effective code generation technique that exploits the parallelism in program loops by overlapping iterations. One drawback of this optimization is that reg...
Alex Aletà, Josep M. Codina, Antonio Gonz&a...
DAC
2002
ACM
15 years 11 months ago
Design of a high-throughput low-power IS95 Viterbi decoder
The design of high-throughput large-state Viterbi decoders relies on the use of multiple arithmetic units. The global communication channels among these parallel processors often ...
Xun Liu, Marios C. Papaefthymiou