As DRAM access latencies approach a thousand instructionexecution times and on-chip caches grow to multiple megabytes, it is not clear that conventional cache structures continue ...
Communication in cache-coherent distributed shared memory (DSM) often requires invalidating (or writing back) cached copies of a memory block, incurring high overheads. This paper...
Smaller feature sizes, reduced voltage levels, higher transistor counts, and reduced noise margins make future generations of microprocessors increasingly prone to transient hardw...
Reconfigurable hardware has the potential for significant performance improvements by providing support for applicationāspecific operations. We report our experience with Chimae...
Zhi Alex Ye, Andreas Moshovos, Scott Hauck, Prithv...
High performance general-purpose processors are increasingly being used for a variety of application domains scienti c, engineering, databases, and more recently, media processing...
Parthasarathy Ranganathan, Sarita V. Adve, Norman ...