—Thread-level parallelism (TLP) has been extensively studied in order to overcome the limitations of exploiting instruction-level parallelism (ILP) on high-performance superscala...
In this paper we present a novel processor microarchitecture that relieves three of the most important bottlenecks of superscalar processors: the serialization imposed by true dep...
We show empirically that some of the issues that affected the design of linear algebra libraries for distributed memory architectures will also likely affect such libraries for s...
Bryan Marker, Field G. Van Zee, Kazushige Goto, Gr...
Abstract. Harness is a reconÞgurable, heterogeneous distributed metacomputing framework for the dynamic conÞguration of distributed virtual machines, through the use of parallel ...
Wael R. Elwasif, David E. Bernholdt, James Arthur ...
Programs with irregular patterns of dynamic data structures and/or those with complicated control structures such as recursion are notoriously difficult to parallelize efficient...
Sheng Li, Amit Kashyap, Shannon K. Kuntz, Jay B. B...