List of Figures
List of Tables
Preface
Acknowledgments
Trademarks
Architecture and Implementation
Analogy: Piano Architecture
Types of Computer Languages
Why Study Assembly Language?
Prefixes for Binary Multiples
Instruction Set Architectures
The Life Cycle of Computer Architectures
SQUARES: A First Programming Example
Review of Number Systems
Computer Structures and Data Representations
Computer Structures
Instruction Execution
Classes of Instruction Set Architectures
Migration to 64-Bit Architectures
Itanium Information Units and Data Types
The Program Assembler and Debugger
Programming Environments
Program Development Steps
Comparing Variants of a Source File
Assembler Statement Types
The Functions of a Symbolic Assembler
The Assembly Process
The Linking Process
The Program Debugger
Conventions for Writing Programs
Itanium Instruction Formats and Addressing
Overview of Itanium Instruction Formats
Integer Arithmetic Instructions
Bit Encoding for Itanium Instructions
HEXNUM: Using Arithmetic Instructions
Data Access Instructions
Other ALU Instructions
DOTPROD: Using Data Access Instructions
Itanium Addressing Modes
Addressing in Other Architectures
Comparison, Branches, and Predication
Hardware Basis for Control of Flow
Integer Compare Instructions
Program Branching
DOTLOOP: Using a Counted Loop
Stops, Instruction Groups, and Performance
DOTCLOOP: Using the Loop Count Register
Other Structured Programming Constructs
MAXIMUM: Using Conditional Instructions
Logical Operations, Bit-Shifts, and Bytes
Logical Functions
HEXNUM2: Using Logical Masks
Bit and Field Operations
SCANTEXT: Processing Bytes
Integer Multiplication and Division
DECNUM: Converting an Integer to Decimal Format
Using C for ASCII Input and Output
BACKWARD: Using Byte Manipulations
Subroutines, Procedures, and Functions
Memory Stacks
DECNUM2: Using Stack Operations
Register Stacks
Program Segmentation
Calling Conventions
DECNUM3 and BOOTH: Making a Function
Integer Quotients and Remainders
RANDOM: A Callable Function
Floating-Point Operations
Parallels Between Integer and Floating-Point Instructions
Representations of Floating-Point Values
Copying Floating-Point Data
Floating-Point Arithmetic Instructions
HORNER: Evaluating a Polynomial
Predication Based on Floating-Point Values
Integer Operations in Floating-Point Execution Units
Approximations for Reciprocals and Square Roots
APPROXPI: Using Floating-Point Instructions
Input and Output of Text
File Systems
Keyboard and Display I/O
SCANTERM: Using C Standard I/O
SORTSTR: Sorting Strings
Text File I/O
SCANFILE: Input and Output with Files
SORTINT: Sorting Integers from a File
Binary Files
Performance Considerations
Processor-Level Parallelism
Instruction-Level Parallelism
Explicit Parallelism in the Itanium Processors
Software-Pipelined Loops
Modulo Scheduling a Loop
Program Optimization Factors
Fibonacci Numbers
Looking at Output from Compilers
Compilers for RISC-like Systems
Compiling a Simple Program
Optimizing a Simple Program
Inline Optimizations
Profile-Guided or Other Optimizations
Debugging Optimized Programs
Recursion for Fibonacci Numbers Revisited
Parallel Operations
Classification of Computing Systems
Integer Parallel Operations
Applications to Integer Multiplication
Opportunities and Challenges
Floating-Point Parallel Operations
Semaphore Support for Parallel Processes
Variations Among Implementations
Why Implementations Change
How Implementations Change
The Original Itanium Processor
A Major Role for Software
IA-32 Instruction Set Mode
Determining Extensions and Implementation Version
Command-Line Environments
Suggested System Resources
System Hardware
System Software
Desktop Client Access Software
Itanium Instruction Set
Instructions Listed by Function
Instructions Listed by Assembler Opcode
Itanium Registers and Their Uses
Instruction Pointer
General Registers and NaT Bits
Predicate Registers
Branch Registers
Floating-Point Registers
Application Registers
State Management Registers
System Information Registers
System Control Registers
Conditional Assembly and Macros (GCC Assembler)
Interference from Explicit Stops
Repeat Blocks
Conditional Assembly
Macro Processing
Using Labels with Macros
Recursive Macros
Object File Sections
MONEY: A Macro Illustrating Sections
Inline Assembly
HP-UX C Compilers
GCC Compiler for Linux
Intel Compilers for Linux
Bibliography
Answers and Hints for Selected Exercises
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
About the Authors
Index