This paper explores hierarchical instruction scheduling for a tiled processor. Our results show that at the top level of the hierarchy, a simple profile-driven algorithm effective...
Martha Mercaldi, Steven Swanson, Andrew Petersen, ...
When dense matrix computations are too large to fit in cache, previous research proposes tiling to reduce or eliminate capacity misses. This paper presents a new algorithm for ch...
Parallel programming is facilitated by constructs which, unlike the widely used SPMD paradigm, provide programmers with a global view of the code and data structures. These constr...
Jia Guo, Ganesh Bikshandi, Daniel Hoeflinger, Gheo...
The effectiveness of the memory hierarchy is critical for the performance of current processors. The performance of the memory hierarchy can be improved by means of program transf...
We present a language model consisting of a collection of costed bidirectional finite state automata associated with the head words of phrases. The model is suitable for increment...