Software pipelining is a critical optimization for producing efficient code for VLIW/EPIC and superscalar processors in highperformance embedded applications such as digital sign...
Predicated execution can eliminate hard to predict branches and help to enable instruction level parallelism. Many current predication variants exist where the result update is co...
Data caches are essential in modern processors, bridging the widening gap between main memory and processor speeds. However, they yield very complex performance models, which make...
This paper discusses the implementation of a numerical algorithm for simulating incompressible fluid flows based on the finite difference method and designed for parallel compu...
In this paper, we investigate methods for improving the hit rates in the first level of memory hierarchy. Particularly, we propose victim cache structures to reduce the number of ...
Gokhan Memik, Glenn Reinman, William H. Mangione-S...