High-performance clusters are rapidly becoming an important computing platform for both scientific and business applications. To fulfill the new demands and challenges, cluster sy...
Zhihong Zhang, Dan Meng, Jianfeng Zhan, Lei Wang, ...
This paper presents a comparative study of the communication middleware systems suitable for aggregating computational clusters with heterogeneous, incompatible SANs into a common ...
This paper describes an object-oriented software architecture for cluster integration and management that enables extensibility, portability, and scalability. This architecture ha...
James H. Laros III, Lee Ward, Nathan W. Dauchy, Ro...
We describe the communication infrastructure (CI) for our fault-tolerant cluster middleware, which is optimized for two classes of communication: for the applications and for the ...
Ming Li, Wenchao Tao, Daniel Goldberg, Israel Hsu,...
Supermon is a flexible set of tools for high speed, scalable cluster monitoring. Node behavior can be monitored much faster than with other commonly used methods (e.g., rstatd). ...