Sciweavers

MIDDLEWARE
2007
Springer
13 years 10 months ago
Using checkpointing to recover from poor multi-site parallel job scheduling decisions
Recent research in multi-site parallel job scheduling leverages user-provided estimates of job communication characteristics to effectively partition the job across multiple clus...
William M. Jones
CLUSTER
2004
IEEE
13 years 8 months ago
A client-centric grid knowledgebase
Grid computing brings with it additional complexities and unexpected failures. Just keeping track of our jobs traversing different grid resources before completion can at times be...
George Kola, Tevfik Kosar, Miron Livny