Fault-Tolerant Systems

ISBN-10: 0120885255
ISBN-13: 9780120885251
Edition: 2007

Book details

List price: $89.95
Copyright year: 2007
Publisher: Elsevier Science & Technology Books
Publication date: 3/1/2007
Binding: Hardcover
Pages: 400
Size: 7.50" wide x 9.25" long x 1.00" tall
Weight: 2.046
Language: English

There are many applications in which the reliability of the overall system must be far higher than the reliability of its individual components. In such cases, designers devise mechanisms and architectures that allow the system either to mask the effects of a component failure completely or to recover from it so quickly that the application is not seriously affected. This is the work of fault-tolerance designers, and it is increasingly important and complex, not only because the number of mission-critical applications is growing but also because the diminishing reliability of hardware means that even systems for non-critical applications will need to be designed with fault tolerance in mind.

Reflecting the real-world challenges faced by designers of these systems, this book addresses fault-tolerance design with a systems approach to both hardware and software. No other text on the market takes this approach or offers the comprehensive and up-to-date treatment Koren and Krishna provide. Students, designers, and architects of high-performance processors will value this comprehensive overview of the field.

* The first book on fault-tolerance design with a systems approach
* Comprehensive coverage of both hardware and software fault tolerance, as well as information and time redundancy
* Case studies highlight six different computer systems with fault-tolerance techniques implemented in their design
* A complete ancillary package is available to lecturers, including an online solutions manual for instructors and PowerPoint slides

Table of contents

Foreword
Preface
Acknowledgements
About the Authors
Preliminaries
Fault Classification
Types of Redundancy
Basic Measures of Fault Tolerance
Traditional Measures
Network Measures
Outline of This Book
Further Reading
References
Hardware Fault Tolerance
The Rate of Hardware Failures
Failure Rate, Reliability, and Mean Time to Failure
Canonical and Resilient Structures
Series and Parallel Systems
Non-Series/Parallel Systems
M-of-N Systems
Voters
Variations on N-Modular Redundancy
Duplex Systems
Other Reliability Evaluation Techniques
Poisson Processes
Markov Models
Fault-Tolerance Processor-Level Techniques
Watchdog Processor
Simultaneous Multithreading for Fault Tolerance
Byzantine Failures
Byzantine Agreement with Message Authentication
Further Reading
Exercises
References
Information Redundancy
Coding
Parity Codes
Checksum
M-of-N Codes
Berger Code
Cyclic Codes
Arithmetic Codes
Resilient Disk Systems
RAID Level 1
RAID Level 2
RAID Level 3
RAID Level 4
RAID Level 5
Modeling Correlated Failures
Data Replication
Voting: Non-Hierarchical Organization
Voting: Hierarchical Organization
Primary-Backup Approach
Algorithm-Based Fault Tolerance
Further Reading
Exercises
References
Fault-Tolerant Networks
Measures of Resilience
Graph-Theoretical Measures
Computer Networks Measures
Common Network Topologies and Their Resilience
Multistage and Extra-Stage Networks
Crossbar Networks
Rectangular Mesh and Interstitial Mesh
Hypercube Network
Cube-Connected Cycles Networks
Loop Networks
Ad hoc Point-to-Point Networks
Fault-Tolerant Routing
Hypercube Fault-Tolerant Routing
Origin-Based Routing in the Mesh
Further Reading
Exercises
References
Software Fault Tolerance
Acceptance Tests
Single-Version Fault Tolerance
Wrappers
Software Rejuvenation
Data Diversity
Software Implemented Hardware Fault Tolerance (SIHFT)
N-Version Programming
Consistent Comparison Problem
Version Independence
Recovery Block Approach
Basic Principles
Success Probability Calculation
Distributed Recovery Blocks
Preconditions, Postconditions, and Assertions
Exception-Handling
Requirements from Exception-Handlers
Basics of Exceptions and Exception-Handling
Language Support
Software Reliability Models
Jelinski-Moranda Model
Littlewood-Verrall Model
Musa-Okumoto Model
Model Selection and Parameter Estimation
Fault-Tolerant Remote Procedure Calls
Primary-Backup Approach
The Circus Approach
Further Reading
Exercises
References
Checkpointing
What is Checkpointing?
Why is Checkpointing Nontrivial?
Checkpoint Level
Optimal Checkpointing: An Analytical Model
Time Between Checkpoints: A First-Order Approximation
Optimal Checkpoint Placement
Time Between Checkpoints: A More Accurate Model
Reducing Overhead
Reducing Latency
Cache-Aided Rollback Error Recovery (CARER)
Checkpointing in Distributed Systems
The Domino Effect and Livelock
A Coordinated Checkpointing Algorithm
Time-Based Synchronization
Diskless Checkpointing
Message Logging
Checkpointing in Shared-Memory Systems
Bus-Based Coherence Protocol
Directory-Based Protocol
Checkpointing in Real-Time Systems
Other Uses of Checkpointing
Further Reading
Exercises
References
Case Studies
NonStop Systems
Architecture
Maintenance and Repair Aids
Software
Modifications to the NonStop Architecture
Stratus Systems
Cassini Command and Data Subsystem
IBM G5
IBM Sysplex
Itanium
Further Reading
References
Defect Tolerance in VLSI Circuits
Manufacturing Defects and Circuit Faults
Probability of Failure and Critical Area
Basic Yield Models
The Poisson and Compound Poisson Yield Models
Variations on the Simple Yield Models
Yield Enhancement Through Redundancy
Yield Projection for Chips with Redundancy
Memory Arrays with Redundancy
Logic Integrated Circuits with Redundancy
Modifying the Floorplan
Further Reading
Exercises
References
Fault Detection in Cryptographic Systems
Overview of Ciphers
Symmetric Key Ciphers
Public Key Ciphers
Security Attacks Through Fault Injection
Fault Attacks on Symmetric Key Ciphers
Fault Attacks on Public (Asymmetric) Key Ciphers
Countermeasures
Spatial and Temporal Duplication
Error-Detecting Codes
Are These Countermeasures Sufficient?
Final Comment
Further Reading
Exercises
References
Simulation Techniques
Writing a Simulation Program
Parameter Estimation
Point Versus Interval Estimation
Method of Moments
Method of Maximum Likelihood
The Bayesian Approach to Parameter Estimation
Confidence Intervals
Variance Reduction Methods
Antithetic Variables
Using Control Variables
Stratified Sampling
Importance Sampling
Random Number Generation
Uniformly Distributed Random Number Generators
Testing Uniform Random Number Generators
Generating Other Distributions
Fault Injection
Types of Fault Injection Techniques
Fault Injection Application and Tools
Further Reading
Exercises
References
Subject Index
