Introduction to High Performance Computing for Scientists and Eng

ISBN-10: 143981192X

ISBN-13: 9781439811924

Edition: 2010



Book details

List price: $46.99
Copyright year: 2010
Publisher: CRC Press
Publication date: 7/7/2010
Binding: Paperback
Pages: 356
Size: 5.75" wide x 9.00" long x 0.75" tall
Weight: 1.144
Language: English

University of Erlangen-Nuremberg, Germany

Foreword
Preface
About the authors
List of acronyms and abbreviations
Modern processors
Stored-program computer architecture
General-purpose cache-based microprocessor architecture
Performance metrics and benchmarks
Transistors galore: Moore's Law
Pipelining
Superscalarity
SIMD
Memory hierarchies
Cache
Cache mapping
Prefetch
Multicore processors
Multithreaded processors
Vector processors
Design principles
Maximum performance estimates
Programming for vector architectures
Basic optimization techniques for serial code
Scalar profiling
Function- and line-based runtime profiling
Hardware performance counters
Manual instrumentation
Common sense optimizations
Do less work!
Avoid expensive operations!
Shrink the working set!
Simple measures, large impact
Elimination of common subexpressions
Avoiding branches
Using SIMD instruction sets
The role of compilers
General optimization options
Inlining
Aliasing
Computational accuracy
Register optimizations
Using compiler logs
C++ optimizations
Temporaries
Dynamic memory management
Loop kernels and iterators
Data access optimization
Balance analysis and lightspeed estimates
Bandwidth-based performance modeling
The STREAM benchmarks
Storage order
Case study: The Jacobi algorithm
Case study: Dense matrix transpose
Algorithm classification and access optimizations
O(N)/O(N)
O(N²)/O(N²)
O(N³)/O(N²)
Case study: Sparse matrix-vector multiply
Sparse matrix storage schemes
Optimizing JDS sparse MVM
Parallel computers
Taxonomy of parallel computing paradigms
Shared-memory computers
Cache coherence
UMA
ccNUMA
Distributed-memory computers
Hierarchical (hybrid) systems
Networks
Basic performance characteristics of networks
Buses
Switched and fat-tree networks
Mesh networks
Hybrids
Basics of parallelization
Why parallelize?
Parallelism
Data parallelism
Functional parallelism
Parallel scalability
Factors that limit parallel execution
Scalability metrics
Simple scalability laws
Parallel efficiency
Serial performance versus strong scalability
Refined performance models
Choosing the right scaling baseline
Case study: Can slower processors compute faster?
Load imbalance
Shared-memory parallel programming with OpenMP
Short introduction to OpenMP
Parallel execution
Data scoping
OpenMP worksharing for loops
Synchronization
Reductions
Loop scheduling
Tasking
Miscellaneous
Case study: OpenMP-parallel Jacobi algorithm
Advanced OpenMP: Wavefront parallelization
Efficient OpenMP programming
Profiling OpenMP programs
Performance pitfalls
Ameliorating the impact of OpenMP worksharing constructs
Determining OpenMP overhead for short loops
Serialization
False sharing
Case study: Parallel sparse matrix-vector multiply
Locality optimizations on ccNUMA architectures
Locality of access on ccNUMA
Page placement by first touch
Access locality by other means
Case study: ccNUMA optimization of sparse MVM
Placement pitfalls
NUMA-unfriendly OpenMP scheduling
File system cache
ccNUMA issues with C++
Arrays of objects
Standard Template Library
Distributed-memory parallel programming with MPI
Message passing
A short introduction to MPI
A simple example
Messages and point-to-point communication
Collective communication
Nonblocking point-to-point communication
Virtual topologies
Example: MPI parallelization of a Jacobi solver
MPI implementation
Performance properties
Efficient MPI programming
MPI performance tools
Communication parameters
Synchronization, serialization, contention
Implicit serialization and synchronization
Contention
Reducing communication overhead
Optimal domain decomposition
Aggregating messages
Nonblocking vs. asynchronous communication
Collective communication
Understanding intranode point-to-point communication
Hybrid parallelization with MPI and OpenMP
Basic MPI/OpenMP programming models
Vector mode implementation
Task mode implementation
Case study: Hybrid Jacobi solver
MPI taxonomy of thread interoperability
Hybrid decomposition and mapping
Potential benefits and drawbacks of hybrid programming
Topology and affinity in multicore environments
Topology
Thread and process placement
External affinity control
Affinity under program control
Page placement beyond first touch
Solutions to the problems
Bibliography
Index