Abstract: We present a fault-tolerant task pool execution environment capable of performing fine-grained selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...
Applications augmented with adaptive capabilities are becoming common in parallel computing environments that share resources such as main memory, network, or disk I/O. For large...
Nurzhan Ustemirov, Masha Sosonkina, Mark S. Gordon...