Cluster administration
With large clusters, it is common to have a dedicated master node
that is the only machine connected to the outside world. This machine
then acts as both the file server and the compile node. This arrangement
presents a single-system image to the user, who launches jobs from the master
node without ever logging into any of the compute nodes.
There are boot disks available that can help in setting up the
individual nodes of a cluster. Once the master is configured, these
boot disks can be set up to perform a complete system installation
of each node over the network.
Most cluster administrators also develop their own utilities, such as
scripts that operate on every node in the cluster. The rdist
utility, which keeps copies of files identical across multiple hosts,
can also be very helpful.
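As an illustration of the kind of script described above, here is a minimal sketch that runs one command on every node from the master. It assumes passwordless ssh from the master to each node and a hypothetical naming scheme of node01, node02, and so on; the function names (node_names, ssh_command, run_on_node, run_on_all) are invented for this example and do not come from any cluster package.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def node_names(prefix, count, start=1):
    """Return hostnames such as node01, node02, ... (assumed naming scheme)."""
    return [f"{prefix}{i:02d}" for i in range(start, start + count)]


def ssh_command(host, command):
    """Build the argument list that runs one command on one node over ssh."""
    # BatchMode=yes makes ssh fail rather than prompt for a password;
    # ConnectTimeout keeps a dead node from stalling the whole run.
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, command]


def run_on_node(host, command, runner=subprocess.run):
    """Run the command on one node; return (host, exit code, output)."""
    try:
        result = runner(ssh_command(host, command),
                        capture_output=True, text=True, timeout=60)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # ssh missing or node unresponsive: report it instead of aborting.
        return host, None, str(exc)
    return host, result.returncode, result.stdout.strip() or result.stderr.strip()


def run_on_all(hosts, command, runner=subprocess.run, workers=16):
    """Run the command on all nodes in parallel, keeping the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda h: run_on_node(h, command, runner), hosts))


# Example use on the master node (hostnames are hypothetical):
#   for host, code, output in run_on_all(node_names("node", 4), "uptime"):
#       print(f"{host}: exit={code} {output}")
```

Running the nodes in parallel keeps the total time close to that of the slowest node, and collecting failures per node rather than raising makes it easy to spot which machines need attention.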
If you purchase a cluster from a vendor, it should come with
software installed to make it easy to use and maintain the system.
If you build your own system, there are software packages
available that serve the same purpose.
OSCAR (Open Source Cluster Application Resources) is a
fully integrated software bundle designed to make it easy to
build a cluster.
Scyld Beowulf
is a commercial package that enhances the Linux kernel and provides system tools
that produce a cluster with a single system image.
If set up properly, a cluster can be relatively easy to maintain.
The operations that you would normally do on a single machine simply
need to be replicated across many machines. If you have a very large
cluster, you should keep a few spare machines to make it easy to recover
from hardware problems.