Table of contents for Building clustered Linux systems / Robert W. Lucke.

Bibliographic record and links to related information available from the Library of Congress catalog.

Note: Contents data are machine generated based on pre-publication provided by the publisher. Contents may have variations from the printed book or be incomplete or contain other coding.


CONTENTS
List of Figures xv
Preface xxiii
About This Book xxiv
Notation and Conventions xxiv
Using This Book xxiv
Acknowledgments xxv
Production Information xxv
Introduction xxvii
Chapter 1 Parallel Power: Defining The Clustered System Approach 1
1.1 Avoiding Difficulties with the Word "Cluster" 1
1.2 Defining the Word "Cluster" 2
1.3 The Evolution of a Clustered Solution 2
1.3.1 Uni-Processor Systems 3
1.3.2 Symmetric Multi-Processor Systems 5
1.3.3 Networks of Independent Systems 5
1.4 Collapsed Network Computing for Engineering 14
1.5 Scientific Cluster Computing 15
1.5.1 An Example Parallel Problem 15
1.5.2 Refining the Parallel Example 18
1.5.3 Software Communication Facilities 19
1.5.4 High Speed Interconnect 20
1.6 Revisiting the Definition of "Cluster" 21
1.7 Commercial Cluster Computing 21
1.8 High-Performance, High-Throughput, and High-Availability 22
1.9 A Formal Definition of "Cluster" 22
1.10 The Why and Wherefore of Clusters 23
1.11 Summary 23
Chapter 2 One Step At A Time: A Process For Building Clusters 25
2.1 Building Clusters As A Complex Endeavor 25
2.2 Talking About The "P-Word" 27
2.3 Presenting A Formal Cluster Creation Process 28
2.3.1 Phase One: Cluster Solution Design 29
2.3.2 Phase Two: Cluster Installation 32
2.3.3 Phase Three: Cluster Testing 38
2.4 Formal Cluster Process Summary 40
Part 1 Cluster Architecture and Hardware Components 41
Chapter 3 Underneath the Hood: Cluster Hardware Components and Architecture 43
3.1 Hardware Categories in a Cluster 44
3.1.1 Passive Hardware Elements In A Cluster 44
3.1.2 Active Hardware Elements In A Cluster 45
3.2 Cluster Resources And The "Outside" World 47
3.3 A Survey of Cluster Hardware Configurations 50
3.4 High-Throughput Cluster Configurations 50
3.4.1 A "Carpet" Cluster 50
3.4.2 Compute "Farms and Ranches" 52
3.5 High-Availability Cluster Configurations 55
3.5.1 An Example "Virtual" Web Server 56
3.5.2 A Parallel Database Server 58
3.6 High-Performance Cluster Configurations 59
3.6.1 A Visualization Cluster 61
3.6.2 High-Performance Parallel Application Configurations 64
3.7 Common Cluster Hardware Architecture 66
3.8 Cluster Hardware Architecture Summary 67
Chapter 4 Any Way You Slice It: Work and Master Nodes in a Cluster 69
4.1 Criteria For Selecting Compute-slices 70
4.2 An Example Compute-Slice From Hewlett-Packard 70
4.2.1 Analysis of the Example Compute-Slice 73
4.2.2 Comparing the Example Compute-Slice to Similar Systems 77
4.2.3 Example Clusters Using Our Compute-Slices 81
4.3 Thirty-two Bit and 64-Bit Compute-Slices 82
4.3.1 Physical RAM Addressing 82
4.3.2 Process Virtual Address Space 83
4.3.3 Software Implications of 64-Bit Hardware 85
4.4 Memory Bandwidth 86
4.5 Memory and Cache Latency 87
4.6 Number of Processors in a Compute-Slice 89
4.7 I/O Interface Capacity and Performance 90
4.7.1 Peripheral Component Interconnect Implementation 90
4.7.2 Accelerated Graphics Port 91
4.8 Compute-Slice Operating System Support 92
4.9 Master Node Characteristics 92
4.10 Compute-Slice and Master Node Summary 94
Chapter 5 Packet In: Cluster Networking Basics And Example Devices 95
5.1 A Short View of Ethernet Networking History 95
5.2 The OSI Communication Model 96
5.3 Ethernet Network Topologies 97
5.3.1 Ethernet Frames 98
5.3.2 Ethernet Hubs 99
5.3.3 Network Routers 100
5.4 Internet Protocol and Addressing 102
5.4.1 IP and TCP/UDP 102
5.4.2 IP Addressing 103
5.4.3 IP Subnetting 105
5.4.4 IP Supernetting 106
5.4.5 Ethernet Unicast, Multicast, and Broadcast Frames 107
5.4.6 Address Resolution Protocol 108
5.4.7 IPV4 and IPV6 108
5.4.8 Private, Non-Routable Network Addresses 109
5.5 Ethernet Switching Technology 110
5.5.1 Half and Full Duplex Operation 110
5.5.2 Store and Forward versus Cut-Through Switching 111
5.5.3 Collision Domains and Switching 111
5.5.4 Link Aggregation 113
5.5.5 Virtual Local Area Networks 114
5.5.6 Jumbo Frames 115
5.5.7 Managed Versus Unmanaged Switches 116
5.6 Example Switches 116
5.6.1 A Gigabit Ethernet Edge Switch 116
5.6.2 Ethernet Core Switches 117
5.7 Ethernet Networking Summary 119
Chapter 6 Tying It Together: Cluster Data, Management, and Control Networks 121
6.1 Networked System Management and Serial Port Access 121
6.1.1 Remote System Management Access 122
6.1.2 Keyboard, Video, Mouse Switches 123
6.1.3 Serial Port Concentrators or Switches 123
6.2 Cluster Ethernet Network Design 125
6.2.1 Choosing A Cluster-Wide IP Address Scheme 125
6.2.2 IP Addressing Conventions 125
6.2.3 Using Non-Routable Network Addresses 126
6.3 An Example Cluster Ethernet Network Design 127
6.3.1 Choosing the Type of Network and Address Ranges 127
6.3.2 Device Addressing Schemes 128
6.3.3 The Management and Control Networks 128
6.3.4 The Data Network 130
6.3.5 Example IP Address Assignments 133
6.4 Cluster Network Design Summary 133
Chapter 7 Life in the Fast LAN: High-Speed Interconnects and Your Cluster 135
7.1 High-Speed Interconnects 135
7.2 High-Speed Interconnect Latency and Bandwidth 136
7.3 Examining High-Speed Interconnect Topologies 137
7.3.1 Some Common Topologies 138
7.3.2 Cross-Sectional Bandwidth 139
7.3.3 Clos Networks 140
7.3.4 Fat Tree Networks 141
7.4 Ethernet for HSI 142
7.4.1 An Example Ethernet HSI Network 143
7.4.2 Direct-Attach Example Bandwidth 145
7.4.3 Multi-level Attach Example Bandwidth 145
7.4.4 A Larger Ethernet HSI Example 146
7.4.5 Other Ethernet HSI Configurations 147
7.5 Myricom's Myrinet HSI 148
7.6 Infiniband 151
7.7 Dolphin 154
7.8 Quadrics QsNet 155
7.9 HSI Technology Summary and Comparison 156
Part 2 Cluster Software Components and Architecture 159
Chapter 8 The Right Stuff: Linux As The Basis For Clusters 163
8.1 Choosing a Cluster Operating System 163
8.1.1 Hardware Support 164
8.1.2 Operating System Stability 164
8.1.3 Software License Costs 164
8.1.4 Manageability 165
8.1.5 Software Flexibility 165
8.1.6 Openness 166
8.1.7 Scalability 166
8.1.8 Software Availability and Cost 166
8.1.9 Multiple Support Options 167
8.2 Introducing the Linux Operating System and Licensing 167
8.2.1 Linux As "Free" Software 167
8.3 Linux Distributions 169
8.4 Managing Open-source Software "Churn" 170
8.5 Commercial Linux Distributions 171
8.5.1 Red Hat Linux 171
8.5.2 SUSE Linux 174
8.5.3 Conclusions About Commercial Linux Distributions 176
8.6 Free Linux Distributions 177
8.6.1 The Fedora Project 177
8.6.2 Debian Linux 178
8.6.3 Conclusions About Free Distributions 178
8.7 Conclusions About Linux for Clusters 179
Chapter 9 Round and Round It Goes: Booting, Disks, Partitioning, and Local File Systems 181
9.1 Disk Partitioning, Booting, and the BIOS 181
9.1.1 Default Disk Partitioning 182
9.1.2 A Brief Note on IA-64 Disk Partitioning 187
9.1.3 Red Hat Linux Boot Loaders 188
9.2 Booting the Linux Kernel 190
9.2.1 The Linux Initial RAM Disk Image 191
9.3 Linux Local Disk Storage 194
9.3.1 Using The Software RAID 5 Facility 194
9.3.2 Using Software RAID 1 for System Disks 198
9.3.3 RAID Multipath 201
9.3.4 Recovering From Software RAID Failures 202
9.4 Linux File System Types 210
9.5 The Linux /proc and devfs Pseudo-File Systems 211
9.6 The Linux "ext2" and "ext3" Physical File Systems 213
9.6.1 File System Volume Labels 215
9.6.2 Creating The Example ext3 File System 216
9.6.3 The "ext" File System Stride Option For RAID 218
9.7 Standard Mount Options For All File Systems 219
9.8 The Temporary File System 220
9.9 Other Available File System Types 220
9.10 Advanced Performance Tuning 221
9.11 A Word About SMART Monitoring for Disks 221
9.12 Local Disks and File Systems Summary 223
Chapter 10 Supporting Role: Infrastructure Services and Administration 225
10.1 The Big Infrastructure Picture 225
10.2 Initializing Your Cluster's Software Infrastructure 227
10.3 Infrastructure Implementation Recommendations 228
10.3.1 Avoiding Service Interference 228
10.3.2 Redundant Copies of Essential Services 229
10.3.3 Services With Fall-Back Capabilities 230
10.3.4 Single-point Administration 230
10.3.5 Choosing Efficient Services 231
10.3.6 Management of Configuration Information 232
10.4 Protecting Active Configuration Information 233
10.5 Preparation For Infrastructure Installation 233
10.5.1 Order of Installation 234
10.5.2 Steps For Installing Infrastructure Services 234
10.5.3 Loading the Linux Operating System Distribution 238
10.6 Networking 239
10.6.1 Configuring Ethernet Switching Equipment 239
10.6.2 Network Aliases 240
10.6.3 Channel Bonding 244
10.6.4 Setting The Ethernet Link MTU Size 247
10.6.5 The Media Independent Interface Tool 248
10.7 Enabling and Starting Linux Services 249
10.8 Time Synchronization 250
10.9 Name Services 252
10.9.1 Host Naming Conventions 253
10.9.2 The Name Service Switch File 254
10.9.3 The Hosts File 255
10.9.4 The Domain Name Service 256
10.9.5 The Network Information Service 262
10.9.6 Name Resolution Recommendations 269
10.10 Infrastructure Services Summary 271
Chapter 11 Reach Out and Access Something: Remote Access Services, DHCP, and System Logging 273
11.1 Continuing Infrastructure Installation 273
11.2 "Traditional" User Login and Authentication 274
11.2.1 Using Groups and Directory Permissions 276
11.2.2 Distributing Password Information with NIS 278
11.2.3 Introducing Kerberos 278
11.2.4 Configuring a Kerberos KDC on Linux 280
11.2.5 Creating a Kerberos Slave KDC 284
11.2.6 Kerberos Summary 285
11.3 Remote Access Services 287
11.4 Using BSD Remote Access Services 287
11.5 Kerberized Versions of BSD/ARPA Remote Services 288
11.6 The Secure Shell 293
11.6.1 SSH and Public-key Encryption 294
11.6.2 Configuring the SSH Client and Server 295
11.6.3 Configuring User Identity for SSH 296
11.6.4 SSH Host Keys, Known, and Authorized Hosts 298
11.6.5 Using the Authorized Keys File 299
11.6.6 Fine-tuning SSH Access 301
11.6.7 SSH "scp" and "sftp" Commands 302
11.6.8 SSH Forwarding 303
11.6.9 SSH Summary 306
11.7 The Parallel Distributed Shell 306
11.7.1 Getting and Installing PDSH 307
11.7.2 Compiling PDSH To Use SSH 310
11.7.3 Using PDSH in Your Cluster 311
11.7.4 PDSH Summary 313
11.8 Configuring Dynamic Host Configuration Protocol 313
11.8.1 Client-Side DHCP Information 314
11.8.2 Configuring the DHCP Server 316
11.9 Logging System Activity 319
11.9.1 Operation of The System Logging Daemon 320
11.9.2 Kernel Message Logging 322
11.9.3 Enabling Remote Logging 324
11.9.4 Using "logrotate" To Archive Log Files 326
11.9.5 Using "logwatch" Reporting 328
11.9.6 An Example Subsystem Logging Design 330
11.9.7 Linux System Logging Summary 332
11.10 Access and Logging Services Summary 332
Chapter 12 Installment Plan: Introduction to Compute-Slice Configuration and Installation 333
12.1 Compute-slice Configuration Considerations 333
12.1.1 One-Thousand Pieces Flying in Close Formation 334
12.2 The Single System View 335
12.2.1 Shared System Structure, Individual System Personality 336
12.2.2 Accomplishing Shared System Structure 338
12.2.3 Compute-slice Software Requirements 339
12.3 A Generalized Network Boot Facility, "pxelinux" 340
12.3.1 Configuring TFTP For Booting 341
12.3.2 Configuring the "pxelinux" Software 342
12.3.3 The "pxelinux" Configuration Files 344
12.4 Configuring Network "kickstart" 347
12.4.1 The "kickstart" File Format 348
12.4.2 Making The Install Media Available for "kickstart" 349
12.4.3 The Network "kickstart" Directory 350
12.5 NFS Diskless Configuration 352
12.5.1 The Linux Terminal Server Project 353
12.5.2 Cluster NFS 354
12.6 Introduction to Compute-slice Installation Summary 356
Chapter 13 Improving Your Images: System Installation With SystemImager 357
13.1 Using the "SystemImager" Software 357
13.1.1 Downloading and Installing SI 358
13.1.2 Configuring the SI Server 362
13.1.3 The SI Cold Installation Boot Process 363
13.1.4 SI Server Commands 364
13.1.5 Installing and Configuring the SI Client Software 364
13.1.6 Capturing A Client Image 366
13.1.7 Forcing Hardware to Driver Mapping with SystemConfigurator 372
13.1.8 Installing A Client Image 372
13.1.9 Updating Client Software Without Re-installing 373
13.1.10 Image Management and Naming 374
13.1.11 Avoiding the Big, MAC Gathering Syndrome 375
13.1.12 Summary 375
13.2 Multicast Installation 377
13.2.1 Multicast Basics 377
13.2.2 An Open-source Multicast Facility, "udpcast" 379
13.2.3 A Simple Multicast Example 380
13.2.4 A More Complex Example 381
13.2.5 Command-line Prototyping With Multicast 382
13.2.6 Prototyping A Network Multicast Installation 383
13.2.7 Making More Modifications 386
13.2.8 Generalizing The Multicast Installation Prototype 392
13.2.9 Triggering A Multicast Installation 396
13.3 The SystemImager "flamethrower" Facility 397
13.3.1 Installing "flamethrower" 398
13.3.2 Activating "flamethrower" 398
13.3.3 Additional SystemImager Functionality in Version 3.2.0 401
13.4 System Installation With SystemImager Summary 401
Chapter 14 To Protect and Serve: Providing Data to Your Cluster 403
14.1 Introduction to Cluster File Systems 403
14.1.1 Cluster File System Requirements 404
14.1.2 Network File System Access 405
14.1.3 Parallel File System Access 407
14.2 The Network File System 410
14.2.1 Enabling NFS On the Server 411
14.2.2 Adjusting NFS Mount Daemon Protocol Behavior 412
14.2.3 Tuning The NFS Server Network Parameters 415
14.2.4 NFS and TCP Wrappers 416
14.2.5 Exporting File Systems on the NFS Server 417
14.2.6 Starting the NFS Server Subsystem 417
14.2.7 NFS Client Mount Parameters 418
14.2.8 Using "autofs" on NFS Clients 420
14.2.9 NFS Summary 421
14.3 A Survey of Some Open-Source Parallel File Systems 421
14.3.1 The Parallel Virtual File System, PVFS 422
14.3.2 The Open Global File System, OpenGFS 425
14.3.3 The Lustre File System 427
14.4 Commercially Available Cluster File Systems 433
14.4.1 Red Hat Global File System, GFS 434
14.4.2 The PolyServe(tm) Matrix File System 435
14.4.3 Oracle's Cluster File System, OCFS 437
14.5 Cluster File System Summary 437
Chapter 15 Stuck in the Middle: Cluster Middleware 439
15.1 Introduction to Cluster Middleware 439
15.1.1 Describing the Parallel Application Execution Environment 440
15.1.2 The HSI Message Passing Facility 441
15.1.3 Load Balancing or Job Scheduling 441
15.1.4 Cluster Resource Management 443
15.1.5 Custom Scheduling 444
15.1.6 Monitoring, Measuring, and Managing Your Cluster 445
15.2 The MPICH Library 446
15.2.1 Introduction to MPICH 446
15.2.2 Downloading and Installing MPICH 446
15.2.3 Using "mpirun" 448
15.2.4 Special Versions of MPICH 450
15.2.5 MPICH Summary 451
15.3 The Simple Linux Utility for Resource Management 452
15.4 The Maui Scheduler 453
15.4.1 Maui Scheduler Software Architecture 453
15.4.2 Job Scheduling in Maui 455
15.4.3 Maui Scheduler Summary 456
15.5 The Ganglia Distributed Monitoring and Execution System 456
15.5.1 The Ganglia Software Architecture 457
15.5.2 Introducing Round Robin Database Software, "rrdtool" 458
15.5.3 Downloading and Installing Ganglia Software 462
15.5.4 Ganglia's "gmond" and "gmetad" Daemons 463
15.5.5 Adding Your Own Ganglia Metrics 464
15.5.6 Parallel Authentication with "authd" and "gexec" 467
15.5.7 Starting Parallel Programs with "gexec" 468
15.5.8 Ganglia Summary 469
15.6 Monitoring with Nagios 469
15.6.1 Explaining Nagios 470
15.6.2 Downloading and Installing Nagios 470
15.6.3 Configuring the Web Server for Nagios 472
15.6.4 Configuring and Using Nagios 473
15.6.5 Nagios Summary 481
15.7 Cluster Middleware Summary 481
15.7.1 An Afterword on Linux High-Availability and Open-Source 482
Chapter 16 Put Tab A in Slot C: OSCAR, Rocks, OpenMOSIX, and the Globus Toolkit 485
16.1 Introducing Cluster-Building Toolkits 485
16.1.1 General Installation Process 487
16.2 Installing A Cluster with OSCAR 488
16.2.1 OSCAR Initial Software Installation and Configuration 488
16.2.2 The OSCAR Installation Wizard 489
16.2.3 OSCAR Package Configuration 490
16.2.4 Building An OSCAR Compute-Slice Image 493
16.2.5 Defining and Installing OSCAR Clients 495
16.2.6 Completing the OSCAR Installation 496
16.2.7 Adding and Deleting OSCAR Clients 497
16.2.8 OSCAR Summary 499
16.3 Installing A Cluster with NPACI Rocks 500
16.3.1 Getting The Rocks Software 501
16.3.2 Installing A Cluster Front-End Node Using Rocks 503
16.3.3 Completing The Installation 508
16.3.4 Rocks System Administration 509
16.3.5 Rocks Summary 511
16.4 The OpenMOSIX Project 511
16.4.1 Getting and Installing OpenMOSIX 511
16.4.2 Configuration of OpenMOSIX Clusters 512
16.4.3 OpenMOSIX Summary 512
16.5 Introduction to the Grid Concept 513
16.6 The Globus Toolkit(r) 514
16.6.1 Globus Toolkit Components 514
16.7 Cluster Building Toolkit Summary 516
Part 3 Building and Deploying Your Cluster 517
Chapter 17 Dollars and Sense: Cluster Economics 519
17.1 Initial Perceptions 519
17.2 Setting the Ground Rules 520
17.3 Cluster Cabling and Complexity 521
17.4 Eight Compute-Slice Cluster Hardware Costs 523
17.5 Sixteen Compute-Slice Cluster Hardware Costs 525
17.5.1 Thirty-two Compute-Slice Hardware Costs 525
17.6 Sixty-four Compute-Slice Hardware Costs 527
17.7 One-hundred Twenty-eight Compute-Slice Hardware Costs 529
17.8 The Land Beyond 128 Compute-Slices 529
17.9 Hardware Cost Trends and Analysis 531
17.10 Cluster Economics Summary 533
Chapter 18 Racking Your Brains: Example Cluster Rack Assembly Steps 537
18.1 Assembly Assumptions 538
18.2 Some "Rules of Thumb" For Physical Cluster Assembly 538
18.3 Detailed Cluster Assembly Steps 539
18.3.1 Physical Rack Assembly 540
18.3.2 Physical Management Rack Assembly 541
18.3.3 Physical Compute Rack Assembly 542
18.3.4 Physical Compute Rack System Installation 542
18.3.5 Physical Rack Final Assembly and Checkout 544
18.3.6 Physical Rack Cleanup 544
18.3.7 Physical Rack Positioning 545
18.3.8 Inter-rack Configuration 545
18.3.9 Final Cluster Hardware Assembly and Check-out 546
18.4 Learning From The Example Steps 547
18.4.1 Finding Efficiencies In Cluster Construction 547
18.4.2 Parallelism in Rack Verification and Check-out 550
18.4.3 Parallelism in Inter-rack Cabling 550
18.4.4 Types of Teams and Specific Skills 551
18.5 Physical Assembly Conclusions 552
Chapter 19 Getting Your Cluster Wired: An Example Cable Labeling Scheme 553
19.1 Defining The Cable Problem 554
19.2 Different Classes of Cabling 554
19.2.1 Intra-rack cables 555
19.2.2 Inter-rack cables 556
19.3 A First Pass At A Cable Labeling Scheme 556
19.3.1 Identifying Cable Connection Points 556
19.4 Refining the Cable Documentation Scheme 558
19.4.1 Labeling Cable Ends 558
19.4.2 Tracking And Documenting The Connections 560
19.5 Calculating the Work in Cable Installation 561
19.6 Minimizing Inter-rack Cabling 562
19.7 Cable Labeling System Summary 564
Chapter 20 Physical Constraints: Heat, Space, and Power 565
20.1 Identifying Physical Constraints for Your Cluster 565
20.2 Space, The Initial Frontier 566
20.3 Power Up Requirements 568
20.3.1 System Power Utilization 568
20.4 Taking the Heat 570
20.5 Physical Constraints Summary 573
Appendix A Acronym List 575
Appendix B List of URLs and Software Sources 581
Glossary 591
Bibliography 597
Index 599

Library of Congress Subject Headings for this publication:

Linux.
Operating systems (Computers).
Embedded computer systems -- Programming.