Sciweavers

2918 search results - page 1 / 584
» Distributed Troubleshooting Agents
ICAC
2005
IEEE
13 years 11 months ago
Distributed Troubleshooting Agents
Key issues to address in autonomic job recovery for cluster computing are recognizing job failure; understanding the failure sufficiently to know if and how to restart the job; an...
Charles Earl, Emilio Remolina, Jim Ong, John Brown
IPTPS
2004
Springer
13 years 10 months ago
Friends Troubleshooting Network: Towards Privacy-Preserving, Automatic Troubleshooting
Content sharing is a popular usage of peer-to-peer systems for its inherent scalability and low cost of maintenance. In this paper, we leverage this nature of peer-to-pe...
Helen J. Wang, Yih-Chun Hu, Chun Yuan, Zheng Zhang...
AAAI
1990
13 years 6 months ago
A Design Based Approach to Constructing Computational Solutions to Diagnostic Problems
Troubleshooting problems in real manufacturing environments impose constraints on admissible solutions that make the computational solutions offered by "troubleshooting from ...
D. Volovik, Imran A. Zualkernan, Paul E. Johnson, ...
HPDC
2006
IEEE
13 years 11 months ago
Troubleshooting Distributed Systems via Data Mining
Through massive parallelism, distributed systems enable the multiplication of productivity. Unfortunately, increasing the scale of available machines to users will also multiply d...
David A. Cieslak, Douglas Thain, Nitesh V. Chawla
GRID
2007
Springer
13 years 11 months ago
Log summarization and anomaly detection for troubleshooting distributed systems
Today’s system monitoring tools are capable of detecting system failures such as host failures, OS errors, and network partitions in near-real time. Unfortunately, the same c...
Dan Gunter, Brian Tierney, Aaron Brown, D. Martin ...