High Performance Cluster Computing Architectures and Systems

ISBN-10: 0130137847

ISBN-13: 9780130137845

Edition: 1999

Authors: Rajkumar Buyya

List price: $84.67

A comprehensive guide to state-of-the-art cluster technology for professionals and graduate-level CS students. It covers hardware as well as software issues, including topics such as active messages, lightweight protocols, and message passing systems. Cluster computers can deliver better performance than traditional mainframes and supercomputers at a much lower hardware cost.

Book details

Copyright year: 1999
Publisher: Prentice Hall PTR
Publication date: 5/21/1999
Binding: Paperback
Pages: 896
Size: 7.50" wide x 10.00" long x 2.00" tall
Weight: 3.234

Dr. Rajkumar Buyya is Professor of Computer Science and Software Engineering and Director of the Cloud Computing and Distributed Systems (CLOUDS) Laboratory at the University of Melbourne, Australia. He is also the founding CEO of Manjrasoft Pty Ltd., a spin-off company of the University that commercialises its innovations in Grid and Cloud Computing. He received "Research Excellence Awards" from the University of Melbourne for productive and quality research in computer science and software engineering in 2005 and 2008, and the "2009 IEEE Medal for Excellence in Scalable Computing" for pioneering the economic paradigm for utility-oriented distributed computing platforms such as…

Preface
Requirements and General Issues
Cluster Computing at a Glance
Introduction
Scalable Parallel Computer Architectures
Towards Low Cost Parallel Computing and Motivations
Windows of Opportunity
A Cluster Computer and its Architecture
Clusters Classifications
Commodity Components for Clusters
Network Services/Communication SW
Cluster Middleware and Single System Image (SSI)
Resource Management and Scheduling (RMS)
Programming Environments and Tools
Cluster Applications
Representative Cluster Systems
Cluster of SMPs (CLUMPS)
Summary and Conclusions
Bibliography
Cluster Setup and its Administration
Introduction
Setting up the Cluster
Security
System Monitoring
System Tuning
Bibliography
Constructing Scalable Services
Introduction
Environment
Resource Sharing
Resource Sharing Enhanced Locality
Prototype Implementation and Extension
Conclusions and Future Study
Bibliography
Dependable Clustered Computing
Introduction
Two Worlds Converge
Dependability Concepts
Cluster Architectures
Detecting and Masking Faults
Recovering from Faults
The Practice of Dependable Clustered Computing
Bibliography
Deploying a High Throughput Computing Cluster
Introduction
Condor Overview
Software Development
System Administration
Summary
Bibliography
Performance Models and Simulation
Introduction
New Performance Issues
A Cost Model for Effective Parallel Computing
Conclusions
Bibliography
Metacomputing: Harnessing Informal Supercomputers
General Introduction
The Evolution of Metacomputing
Metacomputer Design Objectives and Issues
Metacomputing Projects
Emerging Metacomputing Environments
Summary and Conclusions
Bibliography
Specifying Resources and Services in Metacomputing Systems
The Need for Resource Description Tools
Schemes for Specifying Hardware and Software Resources
Resource and Service Description (RSD)
Summary
Bibliography
Networking, Protocols, and I/O
High Speed Networks
Introduction
Design Issues
Fast Ethernet
High Performance Parallel Interface (HiPPI)
Asynchronous Transfer Mode (ATM)
Scalable Coherent Interface (SCI)
ServerNet
Myrinet
Memory Channel
Synfinity
Bibliography
Lightweight Messaging Systems
Introduction
Latency/Bandwidth Evaluation of Communication Performance
Traditional Communication Mechanisms for Clusters
Lightweight Communication Mechanisms
Kernel-Level Lightweight Communications
User-Level Lightweight Communications
A Comparison Among Message Passing Systems
Bibliography
Active Messages
Introduction
Requirements
AM Programming Model
AM Implementation
Analysis
Programming Models on AM
Future Work
Bibliography
Xpress Transport Protocol
Network Services for Cluster Computing
A New Approach
XTP Functionality
Performance
Applications
XTP's Future in Cluster Computing
Bibliography
Congestion Management in ATM Clusters
Introduction to ATM Networking
Existing Methodologies
Simulation of ATM on LAN
Migration Planning
Conclusions
Bibliography
Load Balancing Over Networks
Introduction
Methods
Common Errors
Practical Implementations
Summary
Bibliography
Multiple Path Communication
Introduction
Heterogeneity in Networks and Applications
Multiple Path Communication
Case Study
Summary and Conclusion
Bibliography
Network RAM
Introduction
Remote Memory Paging
Network Memory File Systems
Applications of Network RAM in Databases
Summary
Bibliography
Distributed Shared Memory
Introduction
Data Consistency
Network Performance Issues
Other Design Issues
Conclusions
Bibliography
Parallel I/O for Clusters: Methodologies and Systems
Introduction
A Case for Cluster I/O Systems
The Parallel I/O Problem
File Abstraction
Methods and Techniques
Architectures and Systems
The ViPIOS Approach
Conclusions and Future Trends
Bibliography
Software RAID and Parallel Filesystems
Introduction
Physical Placement of Data
Caching
Prefetching
Interfaces
Bibliography
Process Scheduling, Load Sharing, and Balancing
Job and Resource Management Systems
Motivation and Historical Evolution
Components and Architecture of Job- and Resource Management Systems
The State-of-the-Art in RMS
Challenges for the Present and the Future
Summary
Bibliography
Scheduling Parallel Jobs on Clusters
Introduction
Background
Rigid Jobs with Process Migration
Malleable Jobs with Dynamic Parallelism
Communication-Based Coscheduling
Batch Scheduling
Summary
Bibliography
Load Sharing and Fault Tolerance Manager
Introduction
Load Sharing in Cluster Computing
Fault Tolerance by Means of Checkpointing
Integration of Load Sharing and Fault Tolerance
Related Works
Conclusion
Bibliography
Parallel Program Scheduling Techniques
Introduction
The Scheduling Problem for Network Computing Environments
Scheduling Tasks to Machines Connected via Fast Networks
Scheduling Tasks to Arbitrary Processors Networks
CASCH: A Parallelization and Scheduling Tool
Summary and Concluding Remarks
Bibliography
Customized Dynamic Load Balancing
Introduction
Dynamic Load Balancing (DLB)
DLB Modeling and Decision Process
Compiler and Runtime Systems
Experimental Results
Summary
Bibliography
Mapping and Scheduling on Heterogeneous Systems
Introduction
Mapping and Scheduling
The Issues of Task Granularity and Partitioning
Static Scheduling and Dynamic Scheduling
Load Balancing Issues
Summary
Bibliography
Representative Cluster Systems
Beowulf
Searching for Beowulf
System Architecture Evolution
Prevailing Software Practices
Next Steps in Beowulf-Class Computing
Beowulf in the 21st Century
Bibliography
RWC PC Cluster II and SCore Cluster System Software
Introduction
Building a Compact PC Cluster Using Commodity Hardware
SCore Parallel Operating System Environment on Top of Unix
Performance Evaluation
Concluding Remarks
Bibliography
COMPaS: A Pentium Pro PC-Based SMP Cluster
COMPaS: A Pentium Pro PC-Based SMP Cluster
Building PC-Based SMP Cluster
Programming for SMP Cluster
Case Studies - Benchmarks Results on COMPaS
Guidelines for Programming in PC-Based SMP Cluster
Summary
Bibliography
The NanOS Cluster Operating System
Introduction
Architecture Overview
NanOS
MCMM
Hidra
Summary
Bibliography
BSP-Based Adaptive Parallel Processing
Introduction
The Bulk-Synchronous Parallel Model
Parallel Computing on Nondedicated Workstations
Adaptive Parallelism in the BSP Model
A Programming Environment for Adaptive BSP
Application of A-BSP to Parallel Computations
Application of A-BSP to Nondedicated Workstations
Conclusions
Bibliography
MARS: An Adaptive Parallel Programming Environment
Motivation and Goals
Related Work
The Available Capacity of NOWs
The MARS Approach
Experimental Results
Conclusion and Future Work
Bibliography
The Gardens Approach to Adaptive Parallel Computing
Introduction
Related Work
Communication
Adaptation and Tasking
Performance Results
Summary
Bibliography
The ParPar System: A Software MPP
Introduction
The ParPar System
System Configuration and Control
Job Control
Scheduling
Parallel I/O
Project Status
Bibliography
Pitt Parallel Computer
Introduction
The Operating System
The Laplace Problem
Technical Description of the Laplace Program
User Description of the Laplace Operating System
Linear Simultaneous Equations
An Example Application
Summary
Bibliography
The RS/6000 SP System: A Scalable Parallel Cluster
Dual Personalities
SP System Architecture
SP System Structure
Concluding Remarks
Bibliography
A Scalable and Highly Available Clustered Web Server
Introduction
Web Servers and Dynamic Content
Fine-Grain Load Balancing
Shared Filesystems and Scalable I/O
Scalable Database Access on the Web
High Availability
Conclusions
Bibliography
Index