Application domain specific DSP cores are becoming increasingly popular due to their advantageous trade–off between flexibility and cost. However, existing code generation metho...
Adwin H. Timmer, Marino T. J. Strik, Jef L. van Me...
The VLIW processors with static instruction scheduling and thus deterministic execution times are very suitable for highperformance real-time DSP applications. But the two major w...
The foremost goal of superscalar processor design is to increase performance through the exploitation of instruction-level parallelism (ILP). Previous studies have shown that spec...
This paper focuses on generating efficient software pipelined schedules for in-order machines, which we call Converged Trace Schedules. For a candidate loop, we form a string of t...
Accelerating program performance via SIMD vector units is very common in modern processors, as evidenced by the use of SSE, MMX, VSE, and VSX SIMD instructions in multimedia, scien...