Cache behavior modeling is an important part of modern optimizing compilers. In this paper we present a method to estimate the number of cache misses, at compile time, using a mac...
Image registration is a computationally intensive application in the medical imaging domain that places stringent requirements on performance and memory management efficie...
Mainak Sen, Yashwanth Hemaraj, William Plishker, R...
Silicon technology will continue to provide an exponential increase in the availability of raw transistors. Effectively translating this resource into application performance, how...
Steven Swanson, Ken Michelson, Andrew Schwerin, Ma...
Programming specialized network processors (NPUs) is inherently difficult. Unlike mainstream processors, where architectural features such as out-of-order execution and caches hide ...
Multi-core processors with accelerators are becoming commodity components for high-performance computing at scale. While accelerator-based processors have been studied in...
M. Mustafa Rafique, Ali Raza Butt, Dimitrios S. Ni...