Dynamic predication has been proposed to reduce the branch misprediction penalty due to hard-to-predict branch instructions. A recently proposed dynamic predication architecture, ...
Current parallelizing compilers for message-passing machines only support a limited class of data-parallel applications. One method for eliminating this restriction is to combine ...
In this paper we investigate the benefit of scheduling non-critical loads for a higher latency during software pipelining. "Noncritical" denotes those loads that have s...
Sebastian Winkel, Rakesh Krishnaiyer, Robyn Sampso...
—We consider linear precoding and decoding in the downlink of a multiuser multiple-input, multiple-output (MIMO) system. In this scenario, the transmitter and the receivers may e...
Although SIMD (Single Instruction stream Multiple Data stream) parallel computers have existed for decades, it is only in the past few years that a new version of SIMD has evolved...