The shading processors in graphics hardware are becoming increasingly general-purpose. We test, through simulation and benchmarking, the potential performance impact of replacing ...
Thomas M. DuBois, Bryant Lee, Yi Wang, Marc Olano,...
The increasing availability of multi-core and multiprocessor architectures provides new opportunities for improving the performance of many computer simulations. Markov Chain Mont...
Jonathan M. R. Byrd, Stephen A. Jarvis, A. H. Bhal...
This paper discusses our experience with fine-grain synchronization for a variant of the preconditioned conjugate gradient method. This algorithm represents a large class of algo...
As the number of cores on a single chip grows, scalable barrier synchronization becomes increasingly difficult to implement. In software implementations, such as the tournament ba...
With the advent of large-scale heterogeneous environments, there is a need for matching and scheduling algorithms that allow multiple DAG-structured applications to share the...