—Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier tr...
Basilio B. Fraguela, Yevgen Voronenko, Markus P&uu...
The floating-point multiply-add fused (MAF) unit sets a new trend in the processor design to speed up floatingpoint performance in scientific and multimedia applications. This ...
Content addressable memory (CAM) is widely used in many applications that require fast table lookup. Due to the parallel comparison feature and high frequency of lookup, however, ...
Thread migration moves a single call-stack to another machine to improve either load balancing or locality. Current approaches for checkpointing and thread migration are either no...
Multiprocessor memory reference traces provide a wealth of information on the behavior of parallel programs. We have used this information to explore the relationship between kern...
William J. Bolosky, Michael L. Scott, Robert P. Fi...