Notre Dame CRC Wiki

This wiki is organized into two levels of increasing complexity. This page covers the need-to-know basics; the advanced wiki discusses what is required in much more depth. The purpose of this wiki is to explain to users, specifically those in Quant Psych, how they can use the CRC to further their research with distributed and parallel computing. The programming language of choice is R, but the techniques described here are expected to be broadly applicable across technologies. The CRC uses Univa Grid Engine as its batch queuing system, and this wiki is a dedicated resource for learning how to operate within that environment.

And of course, there is a "HELP! I have one day to make this work!" page.

Setting up your environment

Windows

I feel sorry for you.

Mac

See Linux

Linux

Official stance of Notre Dame on Linux Users:

./img/draper.gif

However, in practice, using the CRC as a Linux user is in fact easier than on any other platform, if only because you are already used to the command line.

Getting familiar with the command line

Transferring files to and from the CRC

You can use the command scp (secure copy protocol) to send files to the CRC.

How to transfer a single file:

scp ~/PATH/TO/FILE/file.txt user@frontendmachine.crc.nd.edu:~/PATH/TO/DIR/

How to transfer a folder:

scp -r ~/PATH/TO/DIR user@frontendmachine.crc.nd.edu:~/PATH/TO/DIR/

In both instances you will be prompted for your password.
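
The commands above copy files to the CRC. Copying results back should only require swapping the two arguments, so that the remote path comes first. Below is a minimal sketch under that assumption. The helper name crc_pull, the user, the host and the paths are all hypothetical placeholders, not real values. It prints the command by default so you can check it before anything is transferred.

```shell
#!/bin/bash
# Sketch: pull a folder from the CRC back to the local machine.
# CRC_USER, CRC_HOST and all paths are placeholders; substitute your own.
CRC_USER="user"
CRC_HOST="HOST.crc.nd.edu"   # the front-end machine you normally log in to
DRY_RUN=1                    # 1 = only print the command, 0 = really run it

crc_pull() {
    # $1 = remote path on the CRC, $2 = local destination directory
    if [ "$DRY_RUN" = "1" ]; then
        echo "scp -r ${CRC_USER}@${CRC_HOST}:$1 $2"
    else
        scp -r "${CRC_USER}@${CRC_HOST}:$1" "$2"
    fi
}

# The remote path is quoted so the local shell leaves "~" alone.
crc_pull '~/PATH/TO/DIR' "$HOME/PATH/TO/DIR/"
```

Set DRY_RUN=0 once the printed command looks right; as with uploads, you will be prompted for your password.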

Editing files on the CRC

Eventually, you will have to edit a file on the CRC. This means using a terminal editor of some sort. There are three common editors people use to edit files on Linux servers.

nano (User Friendly)

Nano is by far the simplest and easiest editor to use.

Tutorial here

emacs (Very Powerful)

Emacs is an editor with a very long history.

vim (The Devil)

Useful aliases

An alias is a shortcut on the command line that you can use to run common tasks much more quickly.

Add the following to your .cshrc file and see what they do!

#Additional aliases (replace "username" with your own CRC username)
alias ls 'ls --color=auto'
alias ll 'ls -l'
# sort -h orders human-readable sizes (K, M, G) correctly; sort -n does not
alias llh 'du -h | sort -h'
alias qwjobs 'qstat -u username | grep qw | wc -l'
# match the state column " r " rather than any line containing the letter r
alias qrjobs 'qstat -u username | grep "username" | grep " r " | wc -l'
alias qst 'qstat -u username'
alias qqt 'qstat | grep --color -E "username|*"'
alias scr 'cd /scratch365/username/'
alias lastrun 'qstat | grep -E "r\s+[0-9]+\/[0-9]+\/[0-9]+.+long" --color=auto | tail -n 1'
alias cf 'ls -1 | wc -l'

ssh hints and tricks

Queues

debug (short) queue

long queue

Open Cores

On the long queue

free_nodes.sh @crc_d12chas

On the debug queue

free_nodes.sh @debug_d12chas

Using multiple software packages

GNU Parallel

interactive jobs

Starting an interactive session:

qrsh -pe smp 8 -now n -N JOBNAME

screen

Screen is a handy tool available on Linux that keeps your session alive if your internet connection drops. In an interactive job, a dropped connection would normally cost you all the progress you have made so far. With screen, you can reattach to the session and pick up where you left off.
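
A typical workflow can be sketched as follows. The session name crc-sim and the helper resume_or_start are hypothetical and used only for illustration. The screen options shown (-S, -ls, -r) and the Ctrl-a d key binding are standard, but treat the helper as a sketch rather than a recommended setup.

```shell
#!/bin/bash
# Sketch of a screen workflow. SESSION is a hypothetical session name.
SESSION="crc-sim"

# Typed by hand in a terminal:
#   screen -S crc-sim     start a new session named crc-sim
#   Ctrl-a d              detach; everything inside keeps running
#   screen -ls            list your sessions
#   screen -r crc-sim     reattach after logging back in

resume_or_start() {
    # $1 = session name. Reattach if it is listed, otherwise create it.
    if screen -ls | grep -q "[.]$1[[:space:]]"; then
        screen -r "$1"
    else
        screen -S "$1"
    fi
}

# Usage (needs a real terminal, so it is not run here):
#   resume_or_start "$SESSION"
```

Start screen first and launch the interactive job from inside it, so that the job lives in the session you can later reattach to.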

for loops

Writing a for loop in bash is super handy.

Example

Say you ran a simulation with 20 thousand conditions and because you failed to do proper error handling, you now suddenly have a list of task ids that need to be re-run.

The -t flag in a submit script accepts only a single task id or a regularly spaced range (n-m:s); it cannot handle an arbitrary, non-contiguous list of task ids.

Solution: Write a bash script with a for loop to loop through the task ids you need to repeat.

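First, write a bash script (saved here as bash-for-loop.sh), such as the following:

#!/bin/bash
# Resubmit each failed task id as its own single-task job.
for i in 15 27 36 98 752 # ids you need to replicate
do
    qsub -t "$i" submit.job
done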

Then, make it executable:

chmod +x bash-for-loop.sh

Then, run it:

./bash-for-loop.sh
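
When many of the failed ids are consecutive, one qsub call per id is more than is needed, because -t also accepts a range such as 15-17. The sketch below collapses a sorted list of ids into ranges first. The helper names collapse_ranges and emit_range are hypothetical, the ids must be given in ascending order without duplicates, and the loop only prints the qsub commands so the result can be inspected before submitting.

```shell
#!/bin/bash
# Sketch: collapse sorted task ids into contiguous ranges,
# e.g. 15 16 17 27 becomes 15-17 and 27.

emit_range() {
    # $1 = first id of the run, $2 = last id of the run
    if [ "$1" -eq "$2" ]; then echo "$1"; else echo "$1-$2"; fi
}

collapse_ranges() {
    start=""; prev=""
    for id in "$@"; do
        if [ -z "$start" ]; then
            start=$id                      # first id opens the first run
        elif [ "$id" -ne $((prev + 1)) ]; then
            emit_range "$start" "$prev"    # gap found: close the current run
            start=$id
        fi
        prev=$id
    done
    if [ -n "$start" ]; then
        emit_range "$start" "$prev"        # close the final run
    fi
}

for range in $(collapse_ranges 15 16 17 27 36 37 98 752); do
    echo qsub -t "$range" submit.job       # remove "echo" to really submit
done
```

For the ids above this prints five qsub commands (15-17, 27, 36-37, 98, 752) instead of eight.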

Ids stored in a file

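Let’s say the list is too long for the ids to be placed manually into the bash script. Instead, you can store the ids in a file and pass the file name as an argument to the script, where it is available as $1.

#!/bin/bash
# $1 is the file holding the task ids, separated by whitespace or newlines.
for i in $(cat "$1")
do
    qsub -t "$i" submit.job
done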

Then, make it executable:

chmod +x bash-for-loop-args.sh

Then, run it with the argument ids.txt:

./bash-for-loop-args.sh ids.txt
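
A loop over the output of cat passes every whitespace-separated token to qsub, including anything in the file that is not a task id. A slightly more defensive variant reads the file line by line, skips blank lines and lines starting with #, and rejects anything that is not a plain integer. This is a sketch only: the function name submit_ids and the QSUB variable are hypothetical, and QSUB defaults to a dry run that prints the commands.

```shell
#!/bin/bash
# Sketch: read one task id per line, with basic validation.
QSUB="echo qsub"   # dry run; set QSUB=qsub to really submit

submit_ids() {
    # $1 = file with one task id per line
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ''|'#'*)                       # skip empty lines and comments
                continue ;;
            *[!0-9]*)                      # anything containing a non-digit
                echo "skipping invalid id: $line" >&2
                continue ;;
        esac
        $QSUB -t "$line" submit.job
    done < "$1"
}

# Demonstration with a temporary file standing in for ids.txt.
ids_file=$(mktemp)
printf '15\n27\n\n# failed twice\nabc\n98\n' > "$ids_file"
submit_ids "$ids_file"
rm -f "$ids_file"
```

With the sample file, only 15, 27 and 98 reach qsub; the blank line and the comment are skipped silently, and abc is reported on standard error.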