This paper proposes a new fine-grained data distribution operation, MPI_Alltoall_specific, that allows an element-wise distribution of data elements to specific target pro...
This paper presents an eco-friendly daemon that reduces power and energy consumption while better maintaining high performance via an accurate workload characterization that infer...
Though shared virtual memory (SVM) systems promise low cost solutions for high performance computing, they suffer from long memory latencies. These latencies are usually caused by...
Many performance problems observed in high end systems are actually caused by the runtime system and not the application code. Detecting these cases will require parallel performa...
Rashawn L. Knapp, Karen L. Karavanic, Douglas M. P...
Collective communication is very useful for parallel applications, especially those in which matrix and vector data structures need to be manipulated by a group of processes. This...
Rafael Ennes Silva, Delcino Picinin, Marcos E. Bar...