Sciweavers

5 search results - page 1 / 1
IEEEARES
2008
IEEE
13 years 11 months ago
A Lazy Monitoring Approach for Heartbeat-Style Failure Detectors
Failure detectors are a fundamental part of safe fault-tolerant distributed systems. Many failure detectors use heartbeats to draw conclusions about the state of nodes within a ...
Benjamin Satzger, Andreas Pietzowski, Wolfgang Tru...
SAC
2006
ACM
13 years 4 months ago
Combining supervised and unsupervised monitoring for fault detection in distributed computing systems
Fast and accurate fault detection is becoming an essential component of management software for mission-critical systems. A good fault detector makes it possible to initiate repair a...
Haifeng Chen, Guofei Jiang, Cristian Ungureanu, Ke...
ASPLOS
2010
ACM
13 years 11 months ago
ConMem: detecting severe concurrency bugs through an effect-oriented approach
Multicore technology is making concurrent programs increasingly pervasive. Unfortunately, it is difficult to deliver reliable concurrent programs, because of the huge and non-det...
Wei Zhang, Chong Sun, Shan Lu
JMLR
2008
13 years 4 months ago
Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies
When monitoring spatial phenomena, which can often be modeled as Gaussian processes (GPs), choosing sensor locations is a fundamental task. There are several common strategies to ...
Andreas Krause, Ajit Paul Singh, Carlos Guestrin
SIGCOMM
2004
ACM
13 years 10 months ago
A scalable distributed information management system
We present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building bloc...
Praveen Yalagandula, Michael Dahlin