Build your own Linux Cluster
Warning: this page is not complete. I'm still working on it!
I used only open-source software to build the cluster, so everything
was downloaded from the web. Disclaimer: I did this back in 2006,
so versions will have changed, and the procedure may have become simpler.
The aim of this page is only to give a general idea!
Essentially any Linux box with a reasonable amount of RAM and hard disk space will do.
To make life easier, it is best to use standard hardware and to keep all the
compute nodes identical. This is not a necessity, though: heterogeneous
clusters are very much a reality.
That's it! The core software is TORQUE, which is an open-source fork of PBS.
Head Node Configuration:
Step 0: I assume the head node already has the Linux operating system installed.
Step 1: Start NIS service.
Step 2: Start NFS service.
Step 3: Download the TORQUE (tar.gz) file and unpack it into /usr/local/
tar -xvzf torque-2.1.2.tar.gz
#Add node names one per line, for example:
#To start PBS server, execute the command:
pbs_server #starts server
#Other PBS commands
qterm -t quick #shuts down server
pbs_mom #starts pbs client daemon.
pbs_sched #starts the native scheduler. NOTE: this is not required if
#you are going to use MAUI.
pbsnodes -a # check node status
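The node list mentioned above (one node name per line) could be written along these lines. This is only a sketch under assumptions: node01 and node02 are hypothetical host names, the np values are made up, and the file is assumed to live at server_priv/nodes under the TORQUE spool directory. By default the script writes into a scratch directory so it can be tried without touching a real installation.

```shell
#!/bin/sh
# Sketch: write a TORQUE node list, one node per line.
# TORQUE_HOME defaults to a scratch directory; on a real head node it would
# be the TORQUE spool directory, e.g. /var/spool/torque (assumption).
TORQUE_HOME="${TORQUE_HOME:-./torque-sketch}"
NODES_FILE="$TORQUE_HOME/server_priv/nodes"

mkdir -p "$TORQUE_HOME/server_priv"

# node01 and node02 are hypothetical host names; np is the number of
# processors the scheduler may use on that node.
cat > "$NODES_FILE" <<'EOF'
node01 np=2
node02 np=2
EOF

echo "Wrote $NODES_FILE"
```

Once pbs_server and the client daemons are running, pbsnodes -a should list the nodes named in this file.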
Client Node Configuration:
STEP 0: Make the node a NIS and NFS client of the head node.
STEP 1: Copy the files "torque-package-mom-linux-i686.sh" and
"torque-package-clients-linux-i686.sh" from the head node to the clients.
#Add the following two lines
$usecp *:/home1 /home1
#Here /home1 is the common NFS directory exported by the head node.
#The user home directories sit in this partition.
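As a rough sketch, the client daemon configuration could be generated like this. The $usecp line is the one given above. The $pbsserver line, the host name headnode, and the mom_priv/config location are assumptions on my part, not taken from the original setup. The script writes to a scratch directory by default.

```shell
#!/bin/sh
# Sketch: write a minimal pbs_mom configuration for a client node.
# MOM_DIR defaults to a scratch directory; the real file is assumed to be
# mom_priv/config under the TORQUE spool directory.
MOM_DIR="${MOM_DIR:-./torque-sketch/mom_priv}"
HEAD_NODE="${HEAD_NODE:-headnode}"   # hypothetical head node host name

mkdir -p "$MOM_DIR"

# The heredoc delimiter is unquoted so that $HEAD_NODE expands; the "$"
# that starts each pbs_mom directive is escaped to keep it literal.
cat > "$MOM_DIR/config" <<EOF
\$pbsserver $HEAD_NODE
\$usecp *:/home1 /home1
EOF

cat "$MOM_DIR/config"
```

The $usecp entry tells the client to copy job output with a plain cp inside the shared /home1 tree instead of a remote copy.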
Make sure the client and server host names are listed
in the /etc/hosts file.
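A quick way to check this is sketched below. check_hosts is a hypothetical helper name, and the host names in the usage line are placeholders. The function only reports names that do not appear in the given hosts file.

```shell
#!/bin/sh
# Sketch: report which host names are missing from a hosts file.
# Usage: check_hosts HOSTS_FILE NAME...
# Returns 0 if every name is present, 1 otherwise.
check_hosts() {
    hosts_file="$1"
    shift
    missing=0
    for name in "$@"; do
        # Strip comments, then look for the name as a whole field
        # after the address column.
        if ! sed 's/#.*//' "$hosts_file" | awk -v n="$name" '
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit (found ? 0 : 1) }'; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

On a client this might be run as: check_hosts /etc/hosts headnode node01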
STEP 5: Start the PBS client daemon
In order to make full use of the cluster for distributed computing, you
need to install OpenMPI.
./configure CC=gcc CXX=g++ F77=g77 FC=gfortran --with-tm=/var/spool/torque --disable-shared --enable-static
The following line needs to be added to the user .bashrc
./configure --disable-shared --enable-static
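A typical .bashrc entry for a setup like this puts the Open MPI binaries on the PATH. The sketch below assumes a hypothetical install prefix of /usr/local/openmpi, so adjust it to wherever Open MPI actually ended up. With a static build as configured above, no library path setting should be needed.

```shell
# Sketch of a ~/.bashrc addition. OPENMPI_PREFIX is a hypothetical install
# location, not a path taken from this page.
OPENMPI_PREFIX="${OPENMPI_PREFIX:-/usr/local/openmpi}"

# Put mpirun, mpicc and friends ahead of anything else on the PATH.
export PATH="$OPENMPI_PREFIX/bin:$PATH"
```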
SUBMISSION of JOBS:
In order to submit a job to the cluster, you have to use a "batch script". Essentially, this is a file containing the details of your job: how many
nodes it requires, how much memory, how long it will take, etc. A sample script can be found here.
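A batch script of the kind described above might look like the following sketch. The job name, resource values, file name job.pbs and the program ./my_mpi_program are all placeholders, not values from the original cluster. The snippet writes the script to a file so that it can be inspected before submitting.

```shell
#!/bin/sh
# Sketch: write a sample PBS batch script to job.pbs (hypothetical name).
JOB_FILE="${JOB_FILE:-job.pbs}"

# Quoted heredoc delimiter: nothing inside is expanded while writing.
cat > "$JOB_FILE" <<'EOF'
#!/bin/sh
#PBS -N testjob
#PBS -l nodes=2:ppn=2
#PBS -l mem=1gb
#PBS -l walltime=01:00:00
#PBS -j oe

# PBS starts the job in the home directory; move to where qsub was run.
cd "$PBS_O_WORKDIR"

# ./my_mpi_program is a placeholder for your own executable.
mpirun -np 4 ./my_mpi_program
EOF

echo "Wrote $JOB_FILE; submit it with: qsub $JOB_FILE"
```

The #PBS lines request two nodes with two processors each, 1 GB of memory and one hour of run time, and merge standard output and standard error into one file. After submission, qstat shows the state of the job in the queue.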
Essentially, that's it!!
The part that I have skipped is the automated client
installation. That means you install one "golden" client and, using some
software (which I don't deal with here), copy it completely onto the other
clients. It can be pretty tricky. So if you have only a few clients and
just want to get a feel for clustering, the shortcut is to
install all the clients manually!