High-accuracy PDE solvers use multi-dimensional fast Fourier transforms. The FFTs exhibits a static and structured memory access pattern which results in a large amount of communic...
We have designed a maximum likelihood fitter using the actor model to distribute the computation over a heterogeneous network. The prototype implementation uses the SALSA program...
Wei-Jen Wang, Kaoutar El Maghraoui, John Cummings,...
The effectiveness of the memory hierarchy is critical for the performance of current processors. The performance of the memory hierarchy can be improved by means of program transf...
In this paper, we report on the performance of the remote procedure call implementation for the Firefly multiprocessor and analyze the implementation to account precisely for all ...
—Since many embedded systems execute a predefined set of programs, tuning system components to application programs and data is the approach chosen by many design techniques to o...