Abstract—Dynamic runtimes can simplify parallel programming by automatically managing concurrency and locality without further burdening the programmer. Nevertheless, implementin...
Richard M. Yoo, Anthony Romano, Christos Kozyrakis
The Cell BE is a multicore processor with eight vector accelerators (called SPEs) that implement explicit cache management through direct memory access engines. While the Cell has...
Srinivas Chellappa, Franz Franchetti, Markus P&uum...
Abstract—Load elimination is a classical compiler transformation that is increasing in importance for multi-core and many-core architectures. The effect of the transformation is ...
—The problem of automatically generating hardware modules from a high level representation of an application has been at the research forefront in the last few years. In this pap...
Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...