The performance bottleneck for many scientific applications is the cost of memory access inside linear algebra kernels. Tuning such kernels for memory efficiency is a complex ...
Our study of a large set of scientific applications over the past three years indicates that the processing for multidimensional datasets is often highly stylized. The basic proce...
Chialin Chang, Renato Ferreira, Alan Sussman, Joel...
Effective overlap of computation and communication is a well understood technique for latency hiding and can yield significant performance gains for applications on high-end compu...
Aniruddha G. Shet, P. Sadayappan, David E. Bernhol...
Typical MPI applications work in phases of computation and communication, and messages are exchanged in relatively small chunks. This behavior is not optimal for TCP because TCP i...
Context-aware ubiquitous computing environments tend to be highly distributed and heterogeneous, while also featuring increased dynamism as elements, devices and middleware compo...
John Soldatos, Kostas Stamatis, Siamak Azodolmolky...