Dynamic binary translation systems enable a wide range of applications such as program instrumentation, optimization, and security. DBTs use a software code cache to store previou...
The MPI programming model hides network type and topology from developers, but also allows them to seamlessly distribute a computational job across multiple cores in both an intra ...
Set-associative caches are traditionally managed using hardwarebased lookup and replacement schemes that have high energy overheads. Ideally, the caching strategy should be tailor...
Rajiv A. Ravindran, Michael L. Chu, Scott A. Mahlk...
The growing processor/memory performance gap causes the performance of many codes to be limited by memory accesses. If known to exist in an application, strided memory accesses fo...
Tushar Mohan, Bronis R. de Supinski, Sally A. McKe...
Abstract - Overlay multicast, which performs topology construction and data relaying in the application layer, has recently emerged as a promising vehicle for data distribution. In...