Programming Massively Parallel Processors: A Hands-On Approach

ISBN-10: 0123814723

ISBN-13: 9780123814722

Edition: 2010

Authors: David B. Kirk, Wen-Mei W. Hwu

List price: $69.95

Book details

Copyright year: 2010
Publisher: Elsevier Science & Technology Books
Publication date: 1/22/2010
Binding: Paperback
Pages: 280
Size: 7.50" wide x 9.25" long x 0.75" tall
Weight: 1.298
Language: English

Wen-mei W. Hwu is the Walter J. ("Jerry") Sanders III-Advanced Micro Devices Endowed Chair in Electrical and Computer Engineering in the Coordinated Science Laboratory of the University of Illinois at Urbana-Champaign. From 1997 to 1999, Dr. Hwu served as the chairman of the Computer Engineering Program at the University of Illinois. Dr. Hwu received his Ph.D. degree in Computer Science from the University of California, Berkeley. His research interests are in the areas of architecture, implementation, and software for high-performance computer systems. He is the director of the OpenIMPACT project, which has delivered new compiler and computer architecture technologies to the computer industry since 1987. He also serves as the Soft Systems Theme leader of the MARCO/DARPA Gigascale Silicon Research Center (GSRC) and on the Executive Committees of both the GSRC and the MARCO/DARPA Center for Circuit and System Solutions. For his contributions to the areas of compiler optimization and computer architecture, he received the 1993 Eta Kappa Nu Outstanding Young Electrical Engineer Award, the 1994 Xerox Award for Faculty Research, the 1994 University Scholar Award of the University of Illinois, the 1997 Eta Kappa Nu Holmes MacDonald Outstanding Teaching Award, the 1998 ACM SigArch Maurice Wilkes Award, the 1999 ACM Grace Murray Hopper Award, and the 2001 Tau Beta Pi Daniel C. Drucker Eminent Faculty Award. He served as the Franklin Woeltge Distinguished Professor of Electrical and Computer Engineering from 2000 to 2004. He is a fellow of IEEE and ACM.

GPUs as Parallel Computers
Architecture of a Modern GPU
Why More Speed or Parallelism?
Parallel Programming Languages and Models
Overarching Goals
Organization of the Book
History of GPU Computing
Evolution of Graphics Pipelines
The Era of Fixed-Function Graphics Pipelines
Evolution of Programmable Real-Time Graphics
Unified Graphics and Computing Processors
GPGPU: An Intermediate Step
GPU Computing
Scalable GPUs
Recent Developments
Future Trends
Introduction to CUDA
Data Parallelism
CUDA Program Structure
A Matrix-Matrix Multiplication Example
Device Memories and Data Transfer
Kernel Functions and Threading
Function declarations
Kernel launch
Predefined variables
Runtime API
CUDA Threads
CUDA Thread Organization
Using blockIdx and threadIdx
Synchronization and Transparent Scalability
Thread Assignment
Thread Scheduling and Latency Tolerance
CUDA Memories
Importance of Memory Access Efficiency
CUDA Device Memory Types
A Strategy for Reducing Global Memory Traffic
Memory as a Limiting Factor to Parallelism
Performance Considerations
More on Thread Execution
Global Memory Bandwidth
Dynamic Partitioning of SM Resources
Data Prefetching
Instruction Mix
Thread Granularity
Measured Performance and Summary
Floating Point Considerations
Floating-Point Format
Normalized Representation of M
Excess Encoding of E
Representable Numbers
Special Bit Patterns and Precision
Arithmetic Accuracy and Rounding
Algorithm Considerations
Application Case Study: Advanced MRI Reconstruction
Application Background
Iterative Reconstruction
Computing F^Hd
Determine the Kernel Parallelism Structure
Getting Around the Memory Bandwidth Limitation
Using Hardware Trigonometry Functions
Experimental Performance Tuning
Final Evaluation
Application Case Study: Molecular Visualization and Analysis
Application Background
A Simple Kernel Implementation
Instruction Execution Efficiency
Memory Coalescing
Additional Performance Comparisons
Using Multiple GPUs
Parallel Programming and Computational Thinking
Goals of Parallel Programming
Problem Decomposition
Algorithm Selection
Computational Thinking
A Brief Introduction to OpenCL
Data Parallelism Model
Device Architecture
Kernel Functions
Device Management and Kernel Launch
Electrostatic Potential Map in OpenCL
Conclusion and Future Outlook
Goals Revisited
Memory Architecture Evolution
Large Virtual and Physical Address Spaces
Unified Device Memory Space
Configurable Caching and Scratch Pad
Enhanced Atomic Operations
Enhanced Global Memory Access
Kernel Execution Control Evolution
Function Calls within Kernel Functions
Exception Handling in Kernel Functions
Simultaneous Execution of Multiple Kernels
Interruptible Kernels
Core Performance
Double-Precision Speed
Better Control Flow Efficiency
Programming Environment
A Bright Outlook
Matrix Multiplication Host-Only Version Source Code
matrixmul.cu
matrixmul.h
Expected Output
GPU Compute Capabilities
GPU Compute Capability Tables
Memory Coalescing Variations