btools

A series of scripts to automate the execution of commands within a cluster.

Adapted from BYOC (Vance, Poublon, & Polik, 2016) for use with BYOC++.

Conventions

  • The head node is identified as node01
  • The compute nodes are identified as node02 and node03

Installation

  1. On all of the nodes, install and configure rsh

    a. Install the required packages

    $ yum install -y rsh rsh-server

    b. In /etc/securetty, add the following lines

    rsh
    rexec
    rlogin

    c. In /root/.rhosts, add the following lines

    node01 root
    node02 root
    node03 root

    d. In /etc/hosts.equiv, add the following lines

    node01
    node02
    node03

    e. Enable and start the sockets

    $ systemctl enable rsh.socket
    $ systemctl enable rexec.socket
    $ systemctl enable rlogin.socket
    $ systemctl start rsh.socket
    $ systemctl start rexec.socket
    $ systemctl start rlogin.socket

    f. Disable SELinux: in /etc/sysconfig/selinux, change SELINUX=enforcing to

    SELINUX=disabled

    g. Reboot all of the nodes

    $ init 6

  2. On the head node, run btools to create all of the scripts on your machine

    $ ./btools

  3. Add the hostnames of your compute nodes to /usr/local/sbin/bhosts

    node02
    node03

  4. Enable passwordless ssh to remove password prompts for the bsync command

    a. Generate a public/private RSA key pair; press Enter at each prompt to accept the default file and leave the passphrase empty

    $ ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/root/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /root/.ssh/id_rsa.
    Your public key has been saved in /root/.ssh/id_rsa.pub.

    b. Copy the public SSH key to each of the compute nodes, entering that machine's password when prompted

    $ ssh-copy-id -i /root/.ssh/id_rsa.pub node02
    $ ssh-copy-id -i /root/.ssh/id_rsa.pub node03
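One way to confirm that step 4 worked is to attempt a non-interactive login to each compute node. The following is a minimal sketch and not part of btools: check_passwordless, BHOSTS, and SSH_CMD are hypothetical names, and the only assumption about bhosts is the one-hostname-per-line layout shown in step 3.

```shell
# Hypothetical helper, not shipped with btools.
# BHOSTS and SSH_CMD are illustrative variables that can be overridden.
BHOSTS="${BHOSTS:-/usr/local/sbin/bhosts}"
SSH_CMD="${SSH_CMD:-ssh}"

check_passwordless() {
    failed=0
    while IFS= read -r host; do
        # Skip blank lines in bhosts
        [ -n "$host" ] || continue
        # BatchMode=yes makes ssh fail instead of prompting for a password;
        # stdin comes from /dev/null so ssh cannot consume the host list
        if "$SSH_CMD" -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "$host: ok"
        else
            echo "$host: key-based login failed"
            failed=1
        fi
    done < "$BHOSTS"
    return "$failed"
}
```

Running check_passwordless on the head node prints one line per host and returns non-zero if any host would still ask for a password.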

Description

bhosts is a file that contains the hostnames of all of the compute nodes. For example:

node02
node03
node04

brsh loops through all of the hostnames in bhosts, executing rsh <hostname> <command> for each, one node at a time.
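The sequential behavior can be pictured with a short loop. This is a sketch under assumptions rather than the btools source: brsh_sketch, BHOSTS, and RSH_CMD are hypothetical names, and the per-host header line is added here only for readability.

```shell
# Illustrative sketch of a brsh-style loop; the real script may differ.
BHOSTS="${BHOSTS:-/usr/local/sbin/bhosts}"
RSH_CMD="${RSH_CMD:-rsh}"

brsh_sketch() {
    while IFS= read -r host; do
        # Skip blank lines in bhosts
        [ -n "$host" ] || continue
        echo "== $host =="
        # Run the command on one node and wait for it to return;
        # stdin is /dev/null so the remote shell cannot eat the host list
        "$RSH_CMD" "$host" "$@" </dev/null
    done < "$BHOSTS"
}
```

For example, brsh_sketch uptime would run uptime on node02, wait for it to return, and only then move on to node03.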

bexec also executes a command on every hostname in bhosts, but unlike brsh it does not wait for one node to finish before moving on to the next. It starts the command on all of the nodes and then waits for them to finish; additionally, it retrieves and displays the logs for the operations. If Slurm is set up on the cluster, bexec will check the status of the compute nodes through Slurm.
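The start-everything-then-wait pattern can be sketched with shell background jobs. This is an assumption about the general approach, not the actual bexec implementation: bexec_sketch, BHOSTS, RSH_CMD, and the temporary log directory are hypothetical, and the Slurm status check is left out.

```shell
# Illustrative sketch of parallel execution with per-node logs.
BHOSTS="${BHOSTS:-/usr/local/sbin/bhosts}"
RSH_CMD="${RSH_CMD:-rsh}"

bexec_sketch() {
    logdir=$(mktemp -d) || return 1
    # Start the command on every node without waiting in between
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        "$RSH_CMD" "$host" "$@" </dev/null >"$logdir/$host.log" 2>&1 &
    done < "$BHOSTS"
    # Block until all background jobs have exited
    wait
    # Display each node's log in bhosts order
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        echo "== $host =="
        cat "$logdir/$host.log"
    done < "$BHOSTS"
    rm -rf "$logdir"
}
```

Writing each node's output to its own log file keeps the output of simultaneous commands from interleaving on the terminal.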

bpush copies a file to all of the hostnames in bhosts. Like bexec, it runs across the nodes simultaneously and retrieves and displays the logs.

bfiles is a file that contains a list of files to be copied to all of the compute nodes. It contains:

/etc/passwd
/etc/group
/etc/shadow
/etc/gshadow

bsync copies the files listed in bfiles to all of the compute nodes, so the users on the head node also exist on all of the compute nodes.
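The file-by-host copy can be sketched as two nested loops. Several things here are assumptions: the location of bfiles (taken to sit next to bhosts), the use of scp as the copy tool, and the names bsync_sketch, BFILES, and COPY_CMD; the real bsync may use a different transfer command.

```shell
# Illustrative sketch only; paths and copy tool are assumptions.
BHOSTS="${BHOSTS:-/usr/local/sbin/bhosts}"
BFILES="${BFILES:-/usr/local/sbin/bfiles}"   # assumed location
COPY_CMD="${COPY_CMD:-scp}"                  # assumed copy tool

bsync_sketch() {
    while IFS= read -r file; do
        [ -n "$file" ] || continue
        while IFS= read -r host; do
            [ -n "$host" ] || continue
            # -p preserves modification times and modes on the copy
            "$COPY_CMD" -p "$file" "$host:$file" </dev/null ||
                echo "copy of $file to $host failed" >&2
        done < "$BHOSTS"
    done < "$BFILES"
}
```

With the key pair from the installation steps in place, a loop like this runs without a password prompt for each file and host.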

References

Nathan R. Vance, Michael L. Poublon, and William F. Polik, "BYOC: Build Your Own Cluster, Part III - Configuration", Linux Journal, July 2016, Issue 279, 70-98.