— The paper introduces a novel co-compiler and its “vertical” parallelization method, including a general model for co-operating host/accelerator platforms and a new parallel...
For highest performance, a modern microprocessor must be able to determine if an instruction is ready in the same cycle in which it is to be selected for execution. This creates a...
Data communications between producer instructions and consumer instructions through memory incur extra delays that degrade processor performance. In this paper, we introduce a new...
In SMT processors, the complex interplay between private and shared datapath resources needs to be considered in order to realize the full performance potential. In this paper, we...
This paper proposes a new hardware technique for using one core of a CMP to prefetch data for a thread running on another core. Our approach simply executes a copy of all non-cont...