Preface
Requirements and General Issues
Cluster Computing at a Glance
Introduction
Scalable Parallel Computer Architectures
Towards Low Cost Parallel Computing and Motivations
Windows of Opportunity
A Cluster Computer and its Architecture
Clusters Classifications
Commodity Components for Clusters
Network Services/Communication SW
Cluster Middleware and Single System Image (SSI)
Resource Management and Scheduling (RMS)
Programming Environments and Tools
Cluster Applications
Representative Cluster Systems
Cluster of SMPs (CLUMPS)
Summary and Conclusions
Bibliography
Cluster Setup and its Administration
Introduction
Setting up the Cluster
Security
System Monitoring
System Tuning
Bibliography
Constructing Scalable Services
Introduction
Environment
Resource Sharing
Resource Sharing Enhanced Locality
Prototype Implementation and Extension
Conclusions and Future Study
Bibliography
Dependable Clustered Computing
Introduction
Two Worlds Converge
Dependability Concepts
Cluster Architectures
Detecting and Masking Faults
Recovering from Faults
The Practice of Dependable Clustered Computing
Bibliography
Deploying a High Throughput Computing Cluster
Introduction
Condor Overview
Software Development
System Administration
Summary
Bibliography
Performance Models and Simulation
Introduction
New Performance Issues
A Cost Model for Effective Parallel Computing
Conclusions
Bibliography
Metacomputing: Harnessing Informal Supercomputers
General Introduction
The Evolution of Metacomputing
Metacomputer Design Objectives and Issues
Metacomputing Projects
Emerging Metacomputing Environments
Summary and Conclusions
Bibliography
Specifying Resources and Services in Metacomputing Systems
The Need for Resource Description Tools
Schemes for Specifying Hardware and Software Resources
Resource and Service Description (RSD)
Summary
Bibliography
Networking, Protocols, and I/O
High Speed Networks
Introduction
Design Issues
Fast Ethernet
High Performance Parallel Interface (HiPPI)
Asynchronous Transfer Mode (ATM)
Scalable Coherent Interface (SCI)
ServerNet
Myrinet
Memory Channel
Synfinity
Bibliography
Lightweight Messaging Systems
Introduction
Latency/Bandwidth Evaluation of Communication Performance
Traditional Communication Mechanisms for Clusters
Lightweight Communication Mechanisms
Kernel-Level Lightweight Communications
User-Level Lightweight Communications
A Comparison Among Message Passing Systems
Bibliography
Active Messages
Introduction
Requirements
AM Programming Model
AM Implementation
Analysis
Programming Models on AM
Future Work
Bibliography
Xpress Transport Protocol
Network Services for Cluster Computing
A New Approach
XTP Functionality
Performance
Applications
XTP's Future in Cluster Computing
Bibliography
Congestion Management in ATM Clusters
Introduction to ATM Networking
Existing Methodologies
Simulation of ATM on LAN
Migration Planning
Conclusions
Bibliography
Load Balancing Over Networks
Introduction
Methods
Common Errors
Practical Implementations
Summary
Bibliography
Multiple Path Communication
Introduction
Heterogeneity in Networks and Applications
Multiple Path Communication
Case Study
Summary and Conclusion
Bibliography
Network RAM
Introduction
Remote Memory Paging
Network Memory File Systems
Applications of Network RAM in Databases
Summary
Bibliography
Distributed Shared Memory
Introduction
Data Consistency
Network Performance Issues
Other Design Issues
Conclusions
Bibliography
Parallel I/O for Clusters: Methodologies and Systems
Introduction
A Case for Cluster I/O Systems
The Parallel I/O Problem
File Abstraction
Methods and Techniques
Architectures and Systems
The ViPIOS Approach
Conclusions and Future Trends
Bibliography
Software RAID and Parallel Filesystems
Introduction
Physical Placement of Data
Caching
Prefetching
Interfaces
Bibliography
Process Scheduling, Load Sharing, and Balancing
Job and Resource Management Systems
Motivation and Historical Evolution
Components and Architecture of Job- and Resource Management Systems
The State-of-the-Art in RMS
Challenges for the Present and the Future
Summary
Bibliography
Scheduling Parallel Jobs on Clusters
Introduction
Background
Rigid Jobs with Process Migration
Malleable Jobs with Dynamic Parallelism
Communication-Based Coscheduling
Batch Scheduling
Summary
Bibliography
Load Sharing and Fault Tolerance Manager
Introduction
Load Sharing in Cluster Computing
Fault Tolerance by Means of Checkpointing
Integration of Load Sharing and Fault Tolerance
Related Works
Conclusion
Bibliography
Parallel Program Scheduling Techniques
Introduction
The Scheduling Problem for Network Computing Environments
Scheduling Tasks to Machines Connected via Fast Networks
Scheduling Tasks to Arbitrary Processors Networks
CASCH: A Parallelization and Scheduling Tool
Summary and Concluding Remarks
Bibliography
Customized Dynamic Load Balancing
Introduction
Dynamic Load Balancing (DLB)
DLB Modeling and Decision Process
Compiler and Runtime Systems
Experimental Results
Summary
Bibliography
Mapping and Scheduling on Heterogeneous Systems
Introduction
Mapping and Scheduling
The Issues of Task Granularity and Partitioning
Static Scheduling and Dynamic Scheduling
Load Balancing Issues
Summary
Bibliography
Representative Cluster Systems
Beowulf
Searching for Beowulf
System Architecture Evolution
Prevailing Software Practices
Next Steps in Beowulf-Class Computing
Beowulf in the 21st Century
Bibliography
RWC PC Cluster II and SCore Cluster System Software
Introduction
Building a Compact PC Cluster Using Commodity Hardware
SCore Parallel Operating System Environment on Top of Unix
Performance Evaluation
Concluding Remarks
Bibliography
COMPaS: A Pentium Pro PC-Based SMP Cluster
COMPaS: A Pentium Pro PC-Based SMP Cluster
Building PC-Based SMP Cluster
Programming for SMP Cluster
Case Studies - Benchmarks Results on COMPaS
Guidelines for Programming in PC-Based SMP Cluster
Summary
Bibliography
The NanOS Cluster Operating System
Introduction
Architecture Overview
NanOS
MCMM
Hidra
Summary
Bibliography
BSP-Based Adaptive Parallel Processing
Introduction
The Bulk-Synchronous Parallel Model
Parallel Computing on Nondedicated Workstations
Adaptive Parallelism in the BSP Model
A Programming Environment for Adaptive BSP
Application of A-BSP to Parallel Computations
Application of A-BSP to Nondedicated Workstations
Conclusions
Bibliography
Mars: An Adaptive Parallel Programming Environment
Motivation and Goals
Related Work
The Available Capacity of NOWs
The MARS Approach
Experimental Results
Conclusion and Future Work
Bibliography
The Gardens Approach to Adaptive Parallel Computing
Introduction
Related Work
Communication
Adaptation and Tasking
Performance Results
Summary
Bibliography
The ParPar System: A Software MPP
Introduction
The ParPar System
System Configuration and Control
Job Control
Scheduling
Parallel I/O
Project Status
Bibliography
Pitt Parallel Computer
Introduction
The Operating System
The Laplace Problem
Technical Description of the Laplace Program
User Description of the Laplace Operating System
Linear Simultaneous Equations
An Example Application
Summary
Bibliography
The RS/6000 SP System: A Scalable Parallel Cluster
Dual Personalities
SP System Architecture
SP System Structure
Concluding Remarks
Bibliography
A Scalable and Highly Available Clustered Web Server
Introduction
Web Servers and Dynamic Content
Fine-Grain Load Balancing
Shared Filesystems and Scalable I/O
Scalable Database Access on the Web
High Availability
Conclusions
Bibliography
Index