Reducing the arithmetic precision of a computation has real performance implications, including increased speed, decreased power consumption, and a smaller memory footprint. For s...
Michael D. Linderman, Matthew Ho, David L. Dill, T...
As the amount of available silicon resources on one chip increases, we have seen the advent of ever increasing parallel resources integrated on-chip. Many architectures use these ...
In this work, we present several compiler optimizations to reduce the overhead due to software protection. We first propose an aggressive rematerialization algorithm which attempt...
Vector-thread (VT) architectures exploit multiple forms of parallelism simultaneously. This paper describes a compiler for the Scale VT architecture, which takes advantage of the ...
We propose a new instruction, branch-on-random, that is like a standard conditional branch, except rather than specifying the condition on which the branch should be taken, it spe...