This paper describes a new instruction-supply mechanism, called the eXtended Block Cache (XBC). The goal of the XBC is to improve on the Trace Cache (TC) hit rate, while providing...
Control independence has been put forward as a significant new source of instruction-level parallelism for future generation processors. However, its performance potential under p...
A structured approach to parallel programming allows to construct applications by composing skeletons, i.e., recurring patterns of task- and data-parallelism. First academic and co...
In this paper we present a processor microarchitecture that can simultaneously execute multiple threads and has a clustered design for scalability purposes. A main feature of the ...
Control intensive scalar programs pose a very different challenge to highly pipelined supercomputers than vectorizable numeric applications. Function call/return and branch instru...