Fundamentals of Computer Design
Introduction
The Changing Face of Computing and the Task of the Computer Designer
Technology Trends
Cost, Price, and their Trends
Measuring and Reporting Performance
Quantitative Principles of Computer Design
Putting It All Together: Performance and Price-Performance
Another View: Power Consumption and Efficiency as the Metric
Fallacies and Pitfalls
Concluding Remarks
Historical Perspective and References
Exercises

Instruction Set Principles and Examples
Introduction
Classifying Instruction Set Architectures
Memory Addressing
Addressing Modes for Signal Processing
Type and Size of Operands
Operands for Media and Signal Processing
Operations in the Instruction Set
Operations for Media and Signal Processing
Instructions for Control Flow
Encoding an Instruction Set
Crosscutting Issues: The Role of Compilers
Putting It All Together: The MIPS Architecture
Another View: The Trimedia TM32 CPU
Fallacies and Pitfalls
Concluding Remarks
Historical Perspective and References
Exercises

Instruction-Level Parallelism and its Dynamic Exploitation
Instruction-Level Parallelism: Concepts and Challenges
Overcoming Data Hazards with Dynamic Scheduling
Dynamic Scheduling: Examples and the Algorithm
Reducing Branch Costs with Dynamic Hardware Prediction
High Performance Instruction Delivery
Taking Advantage of More ILP with Multiple Issue
Hardware Based Speculation
Studies of the Limitations of ILP
Limitations on ILP for Realizable Processors
Putting It All Together: The P6 Microarchitecture
Another View: Thread Level Parallelism
Crosscutting Issues: Using an ILP Datapath to Exploit TLP
Fallacies and Pitfalls
Concluding Remarks
Historical Perspective and References
Exercises

Exploiting Instruction Level Parallelism with Software Approaches
Basic Compiler Techniques for Exposing ILP
Static Branch Prediction
Static Multiple Issue: the VLIW Approach
Advanced Compiler Support for Exposing and Exploiting ILP
Hardware Support for Exposing More Parallelism at Compile-Time
Crosscutting Issues
Putting It All Together: The Intel IA-64 Architecture and Itanium Processor
Another View: ILP in the Embedded and Mobile Markets
Fallacies and Pitfalls
Concluding Remarks
Historical Perspective and References
Exercises

Memory-Hierarchy Design
Introduction
Review of the ABCs of Caches
Cache Performance
Reducing Cache Miss Penalty
Reducing Miss Rate
Reducing Cache Miss Penalty or Miss Rate via Parallelism
Reducing Hit Time
Main Memory and Organizations for Improving Performance
Memory Technology
Virtual Memory
Protection and Examples of Virtual Memory
Crosscutting Issues in the Design of Memory Hierarchies
Putting It All Together: Alpha 21264 Memory Hierarchy
Another View: The Emotion Engine of the Sony Playstation 2
Another View: The Sun Fire 6800 Server
Fallacies and Pitfalls
Concluding Remarks
Historical Perspective and References
Exercises

Multiprocessors and Thread-Level Parallelism
Introduction
Characteristics of Application Domains
Symmetric Shared-Memory Architectures
Performance of Symmetric Shared-Memory Multiprocessors
Distributed Shared-Memory Architectures
Performance of Distributed Shared-Memory Multiprocessors
Synchronization
Models of Memory Consistency: An Introduction
Multithreading: Exploiting Thread-Level Parallelism within a Processor
Crosscutting Issues
Putting It All Together: Sun's Wildfire Prototype
Another View: Multithreading in a Commercial Server
Another View: Embedded Multiprocessors
Fallacies and Pitfalls
Concluding Remarks
Historical Perspective and References
Exercises
Cha