Sciweavers

ISCA
2012
IEEE
11 years 7 months ago
Viper: Virtual pipelines for enhanced reliability
The reliability of future processors is threatened by decreasing transistor robustness. Current architectures focus on delivering high performance at low cost; lifetime device rel...
Andrea Pellegrini, Joseph L. Greathouse, Valeria B...
ASPLOS
2012
ACM
12 years 8 days ago
Relyzer: exploiting application-level fault equivalence to analyze application resiliency to transient faults
Future microprocessors need low-cost solutions for reliable operation in the presence of failure-prone devices. A promising approach is to detect hardware faults by deploying low-...
Siva Kumar Sastry Hari, Sarita V. Adve, Helia Naei...
DAC
2011
ACM
12 years 4 months ago
DRAIN: distributed recovery architecture for inaccessible nodes in multi-core chips
As transistor dimensions continue to scale deep into the nanometer regime, silicon reliability is becoming a chief concern. At the same time, transistor counts are scaling up, ena...
Andrew DeOrio, Konstantinos Aisopos, Valeria Berta...
ISCA
2000
IEEE
13 years 9 months ago
Transient fault detection via simultaneous multithreading
Smaller feature sizes, reduced voltage levels, higher transistor counts, and reduced noise margins make future generations of microprocessors increasingly prone to transient hardw...
Steven K. Reinhardt, Shubhendu S. Mukherjee
GI
2003
Springer
13 years 9 months ago
Byzantine Failures and Security: Arbitrary is not (always) Random
The Byzantine failure model allows arbitrary behavior of a certain fraction of network nodes in a distributed system. It was introduced to model and analyze the effects of very s...
Felix C. Gärtner
SAC
2005
ACM
13 years 10 months ago
An agent model for fault-tolerant systems
This paper describes the use of fault tolerance in a multiagent system. Such an approach is based on the modeling of autonomous agents with planning capabilities. These capabiliti...
Avelino F. Zorzo, Felipe Rech Meneguzzi
MICRO
2009
IEEE
13 years 11 months ago
mSWAT: low-cost hardware fault detection and diagnosis for multicore systems
Continued technology scaling is resulting in systems with billions of devices. Unfortunately, these devices are prone to failures from various sources, resulting in even commodity...
Siva Kumar Sastry Hari, Man-Lap Li, Pradeep Ramach...