This work provides a systematic study of the impact of communication performance on parallelapplications in a high performance network of workstations. We develop an experimental ...
Richard P. Martin, Amin Vahdat, David E. Culler, T...
The current trend is for processors to deliver dramatic improvements in parallel performance while only modestly improving serial performance. Parallel performance is harvested th...
Sanjeev Kumar, Daehyun Kim, Mikhail Smelyanskiy, Y...
In this paper, we propose tree structure lossless coding, by which compression data are arranged in a tree structure. Current compression methods show improved performance by prod...
Improving program locality has become increasingly important on modern computer systems. An effective strategy is to group computations on the same data so that once the data are ...
This paper presents techniques for compiling loops with complex, indirect array accesses into loops whose array references have at most one level of indirection. The transformatio...