Until recently, a steadily rising clock rate and other uniprocessor microarchitectural improvements could be relied upon to consistently deliver increasing performance for a wide ...
Guilherme Ottoni, Ram Rangan, Adam Stoler, David I...
Map-Reduce is a programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. Through ...
Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao, Dougl...
Single-dimension Software Pipelining (SSP) has been proposed as an effective software pipelining technique for multi-dimensional loops [16]. This paper introduces for the first tim...
Open multithreaded transactions constitute building blocks that allow a developer to design and structure the execution of complex distributed systems featuring cooperative and co...
We present a novel approach to ray tracing execution on commodity graphics hardware using CUDA. We decompose
a standard ray tracing algorithm into several data-parallel stages tha...