Modern Processor Design: Fundamentals of Superscalar Processors

ISBN-10: 0070570647
ISBN-13: 9780070570641
Edition: 2005
List price: $173.33



Book details

Copyright year: 2005
Publisher: McGraw-Hill Higher Education
Publication date: 7/7/2004
Binding: Hardcover
Pages: 656
Size: 7.50" wide x 9.25" long x 1.25" tall
Weight: 2.530 lb
Language: English

Table of Contents
Preface
Processor Design
    The Evolution of Microprocessors
    Instruction Set Processor Design
        Digital Systems Design
        Architecture, Implementation, and Realization
        Instruction Set Architecture
        Dynamic-Static Interface
    Principles of Processor Performance
        Processor Performance Equation
        Processor Performance Optimizations
        Performance Evaluation Method
    Instruction-Level Parallel Processing
        From Scalar to Superscalar
        Limits of Instruction-Level Parallelism
        Machines for Instruction-Level Parallelism
    Summary
Pipelined Processors
    Pipelining Fundamentals
        Pipelined Design
        Arithmetic Pipeline Example
        Pipelining Idealism
        Instruction Pipelining
    Pipelined Processor Design
        Balancing Pipeline Stages
        Unifying Instruction Types
        Minimizing Pipeline Stalls
        Commercial Pipelined Processors
    Deeply Pipelined Processors
    Summary
Memory and I/O Systems
    Introduction
    Computer System Overview
    Key Concepts: Latency and Bandwidth
    Memory Hierarchy
        Components of a Modern Memory Hierarchy
        Temporal and Spatial Locality
        Caching and Cache Memories
        Main Memory
    Virtual Memory Systems
        Demand Paging
        Memory Protection
        Page Table Architectures
    Memory Hierarchy Implementation
    Input/Output Systems
        Types of I/O Devices
        Computer System Busses
        Communication with I/O Devices
        Interaction of I/O Devices and Memory Hierarchy
    Summary
Superscalar Organization
    Limitations of Scalar Pipelines
        Upper Bound on Scalar Pipeline Throughput
        Inefficient Unification into a Single Pipeline
        Performance Lost Due to a Rigid Pipeline
    From Scalar to Superscalar Pipelines
        Parallel Pipelines
        Diversified Pipelines
        Dynamic Pipelines
    Superscalar Pipeline Overview
        Instruction Fetching
        Instruction Decoding
        Instruction Dispatching
        Instruction Execution
        Instruction Completion and Retiring
    Summary
Superscalar Techniques
    Instruction Flow Techniques
        Program Control Flow and Control Dependences
        Performance Degradation Due to Branches
        Branch Prediction Techniques
        Branch Misprediction Recovery
        Advanced Branch Prediction Techniques
        Other Instruction Flow Techniques
    Register Data Flow Techniques
        Register Reuse and False Data Dependences
        Register Renaming Techniques
        True Data Dependences and the Data Flow Limit
        The Classic Tomasulo Algorithm
        Dynamic Execution Core
        Reservation Stations and Reorder Buffer
        Dynamic Instruction Scheduler
        Other Register Data Flow Techniques
    Memory Data Flow Techniques
        Memory Accessing Instructions
        Ordering of Memory Accesses
        Load Bypassing and Load Forwarding
        Other Memory Data Flow Techniques
    Summary
The PowerPC 620
    Introduction
    Experimental Framework
    Instruction Fetching
        Branch Prediction
        Fetching and Speculation
    Instruction Dispatching
        Instruction Buffer
        Dispatch Stalls
        Dispatch Effectiveness
    Instruction Execution
        Issue Stalls
        Execution Parallelism
        Execution Latency
    Instruction Completion
        Completion Parallelism
        Cache Effects
    Conclusions and Observations
    Bridging to the IBM POWER3 and POWER4
    Summary
Intel's P6 Microarchitecture
    Introduction
        Basics of the P6 Microarchitecture
    Pipelining
        In-Order Front-End Pipeline
        Out-of-Order Core Pipeline
        Retirement Pipeline
    The In-Order Front End
        Instruction Cache and ITLB
        Branch Prediction
        Instruction Decoder
        Register Alias Table
        Allocator
    The Out-of-Order Core
        Reservation Station
    Retirement
        The Reorder Buffer
    Memory Subsystem
        Memory Access Ordering
        Load Memory Operations
        Basic Store Memory Operations
        Deferring Memory Operations
        Page Faults
    Summary
    Acknowledgments
Survey of Superscalar Processors
    Development of Superscalar Processors
        Early Advances in Uniprocessor Parallelism: The IBM Stretch
        First Superscalar Design: The IBM Advanced Computer System
        Instruction-Level Parallelism Studies
        By-Products of DAE: The First Multiple-Decoding Implementations
        IBM Cheetah, Panther, and America
        Decoupled Microarchitectures
        Other Efforts in the 1980s
        Wide Acceptance of Superscalar
    A Classification of Recent Designs
        RISC and CISC Retrofits
        Speed Demons: Emphasis on Clock Cycle Time
        Brainiacs: Emphasis on IPC
    Processor Descriptions
        Compaq / DEC Alpha
        Hewlett-Packard PA-RISC Version 1.0
        Hewlett-Packard PA-RISC Version 2.0
        IBM POWER
        Intel i960
        Intel IA32--Native Approaches
        Intel IA32--Decoupled Approaches
        x86-64
        MIPS
        Motorola
        PowerPC--32-bit Architecture
        PowerPC--64-bit Architecture
        PowerPC-AS
        SPARC Version 8
        SPARC Version 9
    Verification of Superscalar Processors
    Acknowledgments
Advanced Instruction Flow Techniques
    Introduction
    Static Branch Prediction Techniques
        Single-Direction Prediction
        Backwards Taken/Forwards Not-Taken
        Ball/Larus Heuristics
        Profiling
    Dynamic Branch Prediction Techniques
        Basic Algorithms
        Interference-Reducing Predictors
        Predicting with Alternative Contexts
    Hybrid Branch Predictors
        The Tournament Predictor
        Static Predictor Selection
        Branch Classification
        The Multihybrid Predictor
        Prediction Fusion
    Other Instruction Flow Issues and Techniques
        Target Prediction
        Branch Confidence Prediction
        High-Bandwidth Fetch Mechanisms
        High-Frequency Fetch Mechanisms
    Summary
Advanced Register Data Flow Techniques
    Introduction
    Value Locality and Redundant Execution
        Causes of Value Locality
        Quantifying Value Locality
    Exploiting Value Locality without Speculation
        Memoization
        Instruction Reuse
        Basic Block and Trace Reuse
        Data Flow Region Reuse
        Concluding Remarks
    Exploiting Value Locality with Speculation
        The Weak Dependence Model
        Value Prediction
        The Value Prediction Unit
        Speculative Execution Using Predicted Values
        Performance of Value Prediction
        Concluding Remarks
    Summary
Executing Multiple Threads
    Introduction
    Synchronizing Shared-Memory Threads
    Introduction to Multiprocessor Systems
        Fully Shared Memory, Unit Latency, and Lack of Contention
        Instantaneous Propagation of Writes
        Coherent Shared Memory
        Implementing Cache Coherence
        Multilevel Caches, Inclusion, and Virtual Memory
        Memory Consistency
        The Coherent Memory Interface
        Concluding Remarks
    Explicitly Multithreaded Processors
        Chip Multiprocessors
        Fine-Grained Multithreading
        Coarse-Grained Multithreading
        Simultaneous Multithreading
    Implicitly Multithreaded Processors
        Resolving Control Dependences
        Resolving Register Data Dependences
        Resolving Memory Data Dependences
        Concluding Remarks
    Executing the Same Thread
        Fault Detection
        Prefetching
        Branch Resolution
        Concluding Remarks
    Summary
Index
