This is an interactive script for automatically creating users. It will do the following (a sketch of the workflow is shown after the invocation below):
- Create the user with useradd
- Set a random password for this user
- Set the local quota (default 1 GB; can be modified in the script header)
- Synchronize /etc/group and /etc/passwd across executing hosts and storage hosts
- Set the Lustre (lfs) quota (default 5 TB; can be modified in the script header)
- Change the permissions of the user's home directory on the shared filesystem (/lustre/home in our case; the default home location is controlled by the HOME variable in /etc/default/useradd)
- Print the username and password, along with a composed message ready to be emailed
bash add_user.sh
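A minimal sketch of the workflow, assuming the defaults listed above (1 GB local quota, 5 TB Lustre quota, /lustre/home) and the node names from the cluster overview below; the real add_user.sh is interactive and may differ in its details:

#!/bin/bash
# Sketch only; values mirror the defaults described above.
LOCAL_QUOTA_GB=1                                  # local quota default
LFS_QUOTA_TB=5                                    # Lustre quota default
LUSTRE_HOME=/lustre/home                          # controlled by HOME in /etc/default/useradd
EXEC_NODES="cn01 cn02 cn03 cn04 fat01"            # executing nodes
STORAGE_NODES="10.1.1.100 10.1.1.101 10.1.1.102"  # MGT and OST nodes

read -p "New username: " user

# Create the user (with a home directory) and set a random password
useradd -m "$user"
pass=$(openssl rand -base64 12)
echo "$user:$pass" | chpasswd

# Local quota (block hard limit in KB; quotas must be enabled on /state/partition1)
setquota -u "$user" 0 $((LOCAL_QUOTA_GB * 1024 * 1024)) 0 0 /state/partition1

# Synchronize account files to the other hosts
for node in $EXEC_NODES $STORAGE_NODES; do
    scp /etc/passwd /etc/group "$node":/etc/
done

# Lustre quota (block hard limit in KB) and home directory permissions
lfs setquota -u "$user" -B $((LFS_QUOTA_TB * 1024 * 1024 * 1024)) /lustre
chmod 700 "$LUSTRE_HOME/$user"

# Print the credentials and a message ready to paste into an email
echo "User: $user  Password: $pass"
echo "Dear $user, your account has been created. Please change your password after your first login."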
In our case, cn01, cn02, cn03, cn04, and fat01 are the executing nodes. The node at 10.1.1.100 is the MGT node; the nodes at 10.1.1.101 and 10.1.1.102 are OST nodes. The shutdown sequence is MGT first, then OST, and the power-on sequence is also MGT first, then OST.
The job manager we use is SGE, which is bundled with ROCKS7.
Most of the local storage on each node is allocated to /state/partition1.
bash shutdown_clustre/1.shutdown_sge.sh
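A sketch of what the queue shutdown step amounts to, assuming standard SGE commands (the actual contents of 1.shutdown_sge.sh are not reproduced here):

# Disable every queue instance so no new jobs start
qmod -d '*'
# Check whether jobs are still running; optionally remove them before powering off
qstat -u '*'
# qdel -u '*'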
Unmount the Lustre filesystem from all nodes.
bash shutdown_clustre/2.stop_lustre.sh
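A minimal sketch of the unmount step, assuming the node names from the overview above; the clients are unmounted first, then the storage targets (the exact split between 2.stop_lustre.sh and 3.shutdown.sh may differ):

# Unmount the Lustre client mount on every executing node
for node in cn01 cn02 cn03 cn04 fat01; do
    ssh "$node" umount /lustre
done
# Then stop the storage targets: MGT first, then the OSTs
ssh 10.1.1.100 umount /mgt
ssh 10.1.1.101 umount /ost
ssh 10.1.1.102 umount /ost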
SSH to each node and execute the shutdown command.
When shutting down the storage nodes, shut down the MGT node first, then the OSTs.
bash shutdown_clustre/3.shutdown.sh
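A sketch of the power-off step, following the order stated above (MGT first, then OSTs); powering off the executing nodes before the storage nodes is an assumption here:

# Shut down the executing nodes
for node in cn01 cn02 cn03 cn04 fat01; do
    ssh "$node" shutdown now
done
# Shut down the storage nodes: MGT first, then the OSTs
ssh 10.1.1.100 shutdown now
ssh 10.1.1.101 shutdown now
ssh 10.1.1.102 shutdown now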
Before shutting down the login node, make sure the Lustre storage nodes and all other nodes are offline.
shutdown now
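One way to perform the check mentioned above before running shutdown (hostnames taken from the overview):

for host in cn01 cn02 cn03 cn04 fat01 10.1.1.100 10.1.1.101 10.1.1.102; do
    ping -c 1 -W 2 "$host" > /dev/null 2>&1 && echo "$host is still up"
done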
Press the power button of the MGT node first, then the OST nodes, and then the rest of the nodes.
Normally /mgt and /ost will be mounted automatically. If that fails, you need to log in to those nodes and mount them manually.
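If the automount fails, the targets can usually be mounted from the fstab entries already present on each storage node (a sketch, assuming /mgt lives on the MGT node and /ost on the OST nodes):

ssh 10.1.1.100 mount /mgt
ssh 10.1.1.101 mount /ost
ssh 10.1.1.102 mount /ost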
The following scripts will remount the Lustre filesystem and enable all SGE queues.
bash start_lustre.sh
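A sketch of what start_lustre.sh amounts to, assuming the node names above and standard SGE commands; the real script may differ:

# Remount the Lustre client filesystem on every executing node (fstab entry shown later in this README)
for node in cn01 cn02 cn03 cn04 fat01; do
    ssh "$node" mount /lustre
done
# Re-enable all SGE queues
qmod -e '*'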
bash disable_ipv6.sh
This script calls monitor.pl every minute. If memory usage exceeds 90% of total memory, it sends a warning to all users with the wall command. Therefore, you need root or sudo privileges to run this script. I tried to use cron but failed due to SGE environment variable dependencies.
nohup bash monitor.sh &
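A minimal sketch of the monitoring loop described above; the real monitor.pl does the actual reporting, this only shows the memory check and the wall broadcast:

#!/bin/bash
# Must run as root (or via sudo) so that wall reaches every logged-in user.
while true; do
    # Memory usage in percent, from `free` (used / total)
    usage=$(free | awk '/^Mem:/ {printf "%d", $3 / $2 * 100}')
    if [ "$usage" -ge 90 ]; then
        echo "WARNING: memory usage is at ${usage}% on $(hostname). Please check your jobs." | wall
    fi
    sleep 60
done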
bash set_mysql_tmp_dir.sh
Some software, such as MySQL, uses file locking, which Lustre does not support by default. Therefore, I use the localflock option when mounting the Lustre filesystem.
The default space for /tmp and /var is very small, so I bind-mount two directories from local storage onto those two system directories.
$ cat /etc/fstab
172.1.1.100@tcp:/lustre /lustre lustre defaults,_netdev,localflock 0 0
/state/partition1/tmp /tmp none defaults,bind 0 0
/state/partition1/var /var none defaults,bind 0 0
## or use the mount command directly
mount --bind /state/partition1/tmp /tmp
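## and similarly for /var; the Lustre client can be mounted manually with the same localflock option as the fstab entry above
mount --bind /state/partition1/var /var
mount -t lustre -o localflock 172.1.1.100@tcp:/lustre /lustre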