We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...
In this paper we present a new form of revocable lock that streamlines the construction of higher level concurrency abstractions such as atomic multi-word heap updates. The key id...
Distributed Shared Memory (DSM) systems provide a logically shared memory over physically distributed memory to enable parallel computation on Networks of Workstations (NOWs). In ...
The memory consistency model supported by a multiprocessor directly affects its performance. Thus, several attempts have been made to relax the consistency models to allow for mor...
Kourosh Gharachorloo, Anoop Gupta, John L. Henness...
Accelerated graphics cards, or Graphics Processing Units (GPUs), have become ubiquitous in recent years. On the right kinds of problems, GPUs greatly surpass CPUs in terms of raw ...