Compiler optimizations are often driven by specific assumptions about the underlying architecture and implementation of the target machine. For example, when targeting shared-mem...
Jack L. Lo, Susan J. Eggers, Henry M. Levy, Sujay ...
Blocked-execution multiprocessor architectures continuously run atomic blocks of instructions — also called Chunks. Such architectures can boost both performance and software pr...
This paper presents several parallel FFT algorithms with different degree of communication overhead for multiprocessors in Network-on-Chip(NoC) environment. Three different method...
Active networks allow computations to be performed innetwork at routers as messages pass through them. Active networks offer unique opportunities to optimize networkcentric applic...
Parallel programs that use critical sections and are executed on a shared-memory multiprocessor with a writeinvalidate protocol result in invalidation actions that could be elimin...