The Cray X1 Fortran and C/C++ compilers provide a number of loop transformations, notably vectorization and multistreaming, to exploit the multistreaming processor (MSP) hard...
As the computational capacity of sensor nodes grows, they are increasingly equipped to handle more complex functions. Moreover, the need to realize the complete loop of se...
Within a Unified Messaging System, a crucial task is to provide the user with key information on every message received, such as keywords reflecting the object...
Data access usually accounts for more than 50% of the power consumption in a modern signal processing system. To realize a low-power design, reducing memory access power is a critica...
We present a novel approach to ray tracing execution on commodity graphics hardware using CUDA. We decompose a standard ray tracing algorithm into several data-parallel stages tha...