The paper addresses the problem of matching and scheduling DAG-structured applications to both minimize the makespan and maximize the robustness in a heterogeneous computing sys...
Various studies have pointed out the debilitating effects of OS Jitter on the performance of parallel applications on large clusters such as the ASCI Purple and the Mare Nostrum...
Improving memory performance at the software level is more effective in reducing the rapidly expanding gap between processor and memory performance. Loop transformations (e.g. loop un...
Surendra Byna, Xian-He Sun, William Gropp, Rajeev ...
In this paper, we present the design and implementation of a distributed sensor network application for embedded, isolated-word, real-time speech recognition. In our system design...
Chung-Ching Shen, William Plishker, Shuvra S. Bhat...
We propose a novel work partitioning technique, Image Layer Decomposition (ILD), designed specifically to support distributed real-time rendering on commodity clusters. ILD has s...