Most prediction mechanisms predict a single path along which execution continues at a branch. Alternatively, we may exploit parallelism from either possible path of a branch, di...
Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop tran...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
In this paper, we present efficient methods for multidimensional array redistribution. Building on previous work, the basic-cycle calculation technique, we present a basic-block ...
This paper presents performance results for the design and implementation of parallel pipelined Space-Time Adaptive Processing (STAP) algorithms on parallel computers. In particul...
Alok N. Choudhary, Wei-keng Liao, Donald Weiner, P...
Rendering high-quality computer animations requires intensive computation, and therefore a large amount of time. One way to speed up this process is to devise rendering algorithms...