High-performance out-of-order processors spend a significant portion of their execution time on the incorrect program path even though they employ aggressive branch prediction al...
Onur Mutlu, Hyesoon Kim, David N. Armstrong, Yale ...
In autonomous agent systems, memory is an important element to handle agent behaviors appropriately. We present the analysis of memory requirements for robotic tasks including wal...
Run-time parallelization is often the only way to execute the code in parallel when data dependence information is incomplete at compile time. This situation is common in many imp...
This paper proposes a cache hierarchy that enables Web search engines to efficiently process user queries. The different caches in the hierarchy are used to store pieces of data w...
During the last half-decade, a number of research efforts have centered around developing software for generating automatically tuned matrix multiplication kernels. These include ...
John A. Gunnels, Fred G. Gustavson, Greg Henry, Ro...