As high-end computing systems continue to grow in scale, the performance that applications can achieve on such large-scale systems depends heavily on their ability to avoid explic...
Gopalakrishnan Santhanaraman, Pavan Balaji, K. Gop...
Automatic management of large-scale production systems requires a continuous monitoring service to keep track of the states of the managed system. However, it is challenging to ac...
Clusters and applications continue to grow in size while their mean time between failures (MTBF) is getting smaller. Checkpoint/Restart is becoming increasingly important for lar...
When investigating the performance of running scientific/commercial workflows in parallel and distributed systems, we often take into account only the resources allocated to the ...
Ligang He, Mark Calleja, Mark Hayes, Stephen A. Ja...
While current search engines serve known-item search, such as homepage finding, very well, they generally cannot support exploratory search effectively. In exploratory search, user...
Xuanhui Wang, Bin Tan, Azadeh Shakery, ChengXiang ...