Abstract. We present a relational algebra based framework for compiling e cient sparse matrix code from dense DO-ANY loops and a speci cation of the representation of the sparse ma...
level of abstraction, compared with the program representation for scalar optimizations. For example, loop unrolling and loop unrolland-jam transformations exploit the large regist...
Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel M....
The H.264 decoder has a sequential, control intensive front end that makes it difficult to leverage the potential performance of emerging manycore processors. Preparsing is a fun...
Jike Chong, Nadathur Satish, Bryan C. Catanzaro, K...
This paper focuses on generating efficient software pipelined schedules for in-order machines, which we call Converged Trace Schedules. For a candidate loop, we form a string of t...
Hyperthreaded (HT) and simultaneous multithreaded (SMT) processors are now available in commodity workstations and servers. This technology is designed to increase throughput by e...