This paper proposes a simple and efficient implementation method for a hierarchical coarse grain task parallel processing scheme on a SMP machine. OSCAR multigrain parallelizing c...
Understanding why the performance of a multithreaded program does not improve linearly with the number of cores in a sharedmemory node populated with one or more multicore process...
Recent studies of irregular applications such as finite-element mesh generators and data-clustering codes have shown that these applications have a generalized data parallelism ar...
Advances in VLSI technology will enable chips with over a billion transistors within the next decade. Unfortunately, the centralized-resource architectures of modern microprocesso...
Walter Lee, Rajeev Barua, Matthew Frank, Devabhakt...
A universal spatial automaton, called WAVE, for highly parallel processing in arbitrary distributed systems is described. The automaton is based on a virus principle where recursi...