To meet the performance demands of modern architectures, compilers incorporate an everincreasing number of aggressive code transformations. Since most of these transformations are...
Spyridon Triantafyllis, Manish Vachharajani, Neil ...
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded GPUs using CUDA. Prefix computations on linked structures ...
The memory subsystem of a complex multiprocessor systemson-chip (MPSoC) is an important contributor to the chip power consumption. The selection of memory architecture, as well as...
Multimedia processing on embedded devices requires an architecture that leads to high performance, low power consumption, reduced design complexity, and small code size. In this p...
As multicore systems become the dominant mainstream computing technology, one of the most difficult challenges the industry faces is the software. Applications with large amounts ...
Hongtao Zhong, Mojtaba Mehrara, Steven A. Lieberma...