Load balancing and data locality are the two most important factors in the performance of parallel programs on distributed-memory multiprocessors. A good balancing scheme should e...
Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. However, when using multiple GPUs concurrently, the conventional data parallel GPU...
Parallelization of sequential programs is often daunting because of the substantial development cost involved. Various solutions have been proposed to address this concern, includ...
Harnish Botadra, Qiong Cheng, Sushil K. Prasad, Er...
ions Sriram Krishnamoorthy½ Umit Catalyurek¾ Jarek Nieplocha¿ Atanas Rountev½ P. Sadayappan½ ½ Dept. of Computer Science and Engineering, ¾ Dept. of Biomedical Informatics T...
Information on the behavior of programs is essential for deciding the number and nature of functional units in high performance architectures. In this paper, we present studies on...
Lizy Kurian John, Vinod Reddy, Paul T. Hulina, Lee...