This paper considers resource allocation algorithms for processing streams of events on computational grids. For example, financial trading applications are executed on large comp...
Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
NAMD is a portable parallel application for biomolecular simulations. NAMD pioneered the use of hybrid spatial and force decomposition, a technique now used by most scalable pr...
Abhinav Bhatele, Sameer Kumar, Chao Mei, James C. ...
Path profiles provide a more accurate characterization of a program's dynamic behavior than basic block or edge profiles, but are relatively more expensive to collect. This h...
Kapil Vaswani, Aditya V. Nori, Trishul M. Chilimbi
The process arrival pattern, which denotes the times at which different processes arrive at an MPI collective operation, can have a significant impact on the performance of the operatio...