Abstract. In an interpreted execution there is an interdependence between the interpreter's execution and the interpreted application's execution; the implementation of t...
Traditional DMA requires the operating system to perform many tasks to initiate a transfer, with overhead on the order of hundreds or thousands of CPU instructions. This paper des...
Matthias A. Blumrich, Cezary Dubnicki, Edward W. F...
To expand the Scalable Coherent Interface's (SCI) capabilities so it can be used to efficiently handle sharing in systems of hundreds or even thousands of processors, the SCI...
For decades, the serialization constraints imposed by true data dependences have been regarded as an absolute limit--the dataflow limit--on the parallel execution of serial progra...
The advent of parallel executing Address Calculation Units (ACUs) in Digital Signal Processor (DSP) and Application Specific InstructionSet Processor (ASIP) architectures has made...
Clifford Liem, Pierre G. Paulin, Ahmed Amine Jerra...