A series of scripts to automate the execution of commands within a cluster.
Adapted from BYOC (Vance, Poublon, & Polik, 2016) for use with BYOC++.
- The head node is identified as node01
- The compute nodes are identified as node02 and node03
- On all of the nodes, install and configure rsh
a. Install the required packages
$ yum install -y rsh rsh-server
b. In /etc/securetty, add the following lines
rsh
rexec
rlogin
c. In /root/.rhosts, add the following lines
node01 root
node02 root
node03 root
d. In /etc/hosts.equiv, add the following lines
node01
node02
node03
e. Enable and start the sockets
$ systemctl enable rsh.socket
$ systemctl enable rexec.socket
$ systemctl enable rlogin.socket
$ systemctl start rsh.socket
$ systemctl start rexec.socket
$ systemctl start rlogin.socket
f. Disable SELinux by changing SELINUX=enforcing to SELINUX=disabled in /etc/sysconfig/selinux
g. Reboot all of the nodes
$ init 6
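Steps b through d append fixed lines to three files, so they can be scripted. The following is a minimal sketch and is not part of btools. The names append_once, configure_rsh_files, and the ROOT prefix variable are hypothetical, introduced here so the helper can be tried against a scratch directory before it touches the real files.

```shell
#!/bin/sh
# Sketch only: append_once, configure_rsh_files, and ROOT are illustrative
# names, not part of btools. ROOT defaults to empty, so paths resolve to the
# real system files unless a scratch prefix is supplied.
ROOT="${ROOT:-}"

# Append a line to a file unless an identical line is already present.
append_once() {
    file="$ROOT$1"
    line="$2"
    mkdir -p "$(dirname "$file")"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: fixed string rather than regex
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Apply steps b through d for the three nodes used in this guide.
configure_rsh_files() {
    for tty in rsh rexec rlogin; do
        append_once /etc/securetty "$tty"
    done
    for node in node01 node02 node03; do
        append_once /root/.rhosts "$node root"
        append_once /etc/hosts.equiv "$node"
    done
}
```

Because each line is only appended when it is missing, calling configure_rsh_files more than once does not create duplicates. Run it as root on every node; setting ROOT to a temporary directory first gives a dry run.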
- On the head node, run btools to create all of the scripts on your machine
$ ./btools
- Add the hostnames of your compute nodes to /usr/local/sbin/bhosts
node02
node03
- Enable passwordless SSH to remove password prompts for the bsync command
a. Generate a public/private RSA key pair; leave every prompt blank and press Enter for each
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
b. Copy the public SSH key to each of the compute nodes, entering that machine's password when prompted
$ ssh-copy-id -i /root/.ssh/id_rsa.pub node02
$ ssh-copy-id -i /root/.ssh/id_rsa.pub node03
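On a cluster with more than a few compute nodes, the copy step can be driven from the host list instead of being typed once per node. This is a minimal sketch under stated assumptions: the function name copy_key_to_nodes and the BHOSTS, PUBKEY, and SSH_COPY_ID variables are hypothetical and exist only to make the loop easy to adapt.

```shell
#!/bin/sh
# Sketch only: copy_key_to_nodes, BHOSTS, PUBKEY, and SSH_COPY_ID are
# illustrative names, not part of btools.
BHOSTS="${BHOSTS:-/usr/local/sbin/bhosts}"
PUBKEY="${PUBKEY:-/root/.ssh/id_rsa.pub}"
SSH_COPY_ID="${SSH_COPY_ID:-ssh-copy-id}"

copy_key_to_nodes() {
    # A for loop (rather than while-read) leaves stdin untouched,
    # so the password prompts behave as they do when typed by hand.
    for host in $(cat "$BHOSTS"); do
        echo "Copying key to $host"
        "$SSH_COPY_ID" -i "$PUBKEY" "$host" || echo "Failed on $host" >&2
    done
}
```

Calling copy_key_to_nodes on the head node prompts for each compute node's password in turn, exactly as the individual commands above do.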
bhosts
bhosts is a file that contains the hostnames of all of the compute nodes. For example:
node02
node03
node04
brsh
brsh loops through all of the hostnames in bhosts, executing rsh <command> for each.
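The sequential behaviour described for brsh can be pictured with a short loop. This is a sketch of the idea, not the script that btools generates; the function name brsh_sketch and the BHOSTS and RSH override variables are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch of a brsh-style loop; the generated script may differ.
# brsh_sketch, BHOSTS, and RSH are illustrative names.
BHOSTS="${BHOSTS:-/usr/local/sbin/bhosts}"
RSH="${RSH:-rsh}"

brsh_sketch() {
    # Read one hostname per line and skip blank lines.
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        echo "***** $host *****"
        # Each call returns before the next host is contacted.
        # </dev/null stops the remote command from consuming the host list.
        "$RSH" "$host" "$@" < /dev/null
    done < "$BHOSTS"
}
```

For example, brsh_sketch uptime would run uptime on node02, wait for it to return, and only then move on to node03.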
bexec
bexec executes a command over the hostnames in bhosts, but unlike brsh it does not wait for one node to finish before moving on to the next. Instead, bexec starts the command on every node and then waits for all of them to finish; it also retrieves and displays the logs for the operations. If Slurm is set up on the cluster, bexec will check the status of the compute nodes through Slurm.
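The start-everywhere-then-wait pattern can be sketched with background jobs and the shell's wait builtin. This is an illustration of the pattern only, not the script that btools generates: the function name bexec_sketch, the BHOSTS and RSH variables, and the temporary per-host log files are assumptions, and the Slurm status check is left out.

```shell
#!/bin/sh
# Sketch of a bexec-style parallel run; the generated script may differ.
# bexec_sketch, BHOSTS, RSH, and the temporary log directory are illustrative.
BHOSTS="${BHOSTS:-/usr/local/sbin/bhosts}"
RSH="${RSH:-rsh}"

bexec_sketch() {
    logdir="$(mktemp -d)"
    # Start the command on every node in the background, one log file per host.
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        "$RSH" "$host" "$@" > "$logdir/$host.log" 2>&1 < /dev/null &
    done < "$BHOSTS"
    # Block until every background job has exited.
    wait
    # Display the collected logs in host-list order.
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        echo "***** $host *****"
        cat "$logdir/$host.log"
    done < "$BHOSTS"
    rm -rf "$logdir"
}
```

Writing each node's output to its own file keeps the logs from interleaving, which is why they are printed only after wait returns.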
bpush
bpush copies a file to all of the hostnames in bhosts. Like bexec, it runs the copy on all of the nodes simultaneously and retrieves and displays the logs.
bfiles
bfiles is a file that contains a list of files to be copied to all of the compute nodes. It contains:
/etc/passwd
/etc/group
/etc/shadow
/etc/gshadow
bsync
bsync copies the files defined in bfiles to all of the compute nodes. As a result, the users on the head node also exist on all of the compute nodes.
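The relationship between bhosts, bfiles, and bsync can be sketched as two nested loops. This is not the script that btools generates. The use of rsync as the copy tool, the location of bfiles next to bhosts, and the names bsync_sketch, BHOSTS, BFILES, and RSYNC are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of a bsync-style copy; the generated script may differ.
# bsync_sketch, BHOSTS, BFILES, and RSYNC are illustrative names, and both
# the bfiles path and the choice of rsync are assumptions.
BHOSTS="${BHOSTS:-/usr/local/sbin/bhosts}"
BFILES="${BFILES:-/usr/local/sbin/bfiles}"
RSYNC="${RSYNC:-rsync}"

bsync_sketch() {
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        echo "***** $host *****"
        while IFS= read -r file; do
            [ -n "$file" ] || continue
            # -a (archive) preserves ownership, permissions, and timestamps.
            "$RSYNC" -a "$file" "$host:$file" < /dev/null
        done < "$BFILES"
    done < "$BHOSTS"
}
```

rsync connects over SSH by default, so a copy like this runs without prompts only once the key from the passwordless setup step is installed on each compute node.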
Nathan R. Vance, Michael L. Poublon and William F. Polik, "BYOC: Build Your Own Cluster, Part III - Configuration",
Linux Journal, July 2016, Issue 279, 70-98.