This paper presents the evaluation of a non-blocking, decoupled memory/execution, multithreaded architecture known as Scheduled Dataflow (SDF). The major recent trend in digit...
Instruction aggregation—the grouping of multiple operations into a single processing unit—is a technique that has recently been used to amplify the bandwidth and capacity of c...
The run-time performance of VLIW (very long instruction word) microprocessors depends heavily on the effectiveness of their associated optimizing compilers. Typical VLIW compiler pha...
Cache affinity between a process and a processor is observed when the processor cache has accumulated some amount of the process state, i.e., data or instructions. Cache affinity i...
Vahid Kazempour, Alexandra Fedorova, Pouya Alagheb...
Modern processors perform dynamic scheduling to achieve better utilization of execution resources. A schedule created at run-time is often better than one created at compile-time ...
Francesco Spadini, Brian Fahs, Sanjay J. Patel, St...