System Basics [2017-09-14 Thu] (Slides)


  • what is High Performance Computing? Large…
    • # of processors
    • # of runs
    • memory requirements (finance, bio)
    • storage reqs (ML, big data)
    • runtime (optimization, poor coding)
  • HPC does not speed up a single serial process!
  • resources
    • internal
      • ICS-ACI
    • external
      • NSF XSEDE
      • Blue Waters through GLCPC
      • DoE
  • people
    • engagement team (Lavely, Blanton, Pavloski)
    • iAsk center (help answering questions, mostly on ACI)
  • goal: enable research


  • 24000 cores
  • very capable tool
    • ACI-B for batch (high/std/basic memory)
    • ACI-I for interactive (Rstudio)
    • Gateways, GPUs (coming soon)
    • ACI Partitions (CyberLAMP)
    • Hosted resources (LIGO)
  • ACI-B
    • access via ssh only
    • log into an aci-lgn-* node to submit jobs
    • jobs get dedicated resources once they start running
    • what you request is specified in the submission script
    • anybody can access the open allocation w/ some limitations
    • slow X session launchable via ssh -Y
    • used for: production runs, high mem/proc/time
  • ACI-I
    • access via ssh or EoD
    • you are placed on a node and run your processes there
    • limited to 4 procs, 12h, 48GB mem
    • does not guarantee exclusive resources like a compute node
    • used for: debugging, visualization, interaction

File systems

  • three storage location options
    • home
      • /storage/home/userid
      • small (10GB) but well protected
        • not shared w/ anybody else
        • backed up frequently
    • work
      • /gpfs/work/userid
      • 128GB cap
      • you can share data
      • backed up every 3 days or so
    • scratch
      • /gpfs/scratch/userid
      • no file size limit
      • 1M files cap per user
      • files older than 30 days are deleted
  • work & scratch are linked from home
  • research groups can purchase
    • allocations
    • storage space
      • for
        • group (like work but 5TB)
        • archive
      • datamgr.aci.ics.psu
        • allow for faster data transfers
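
The scratch limits above (1M files per user, deletion after 30 days) can be checked with find. This is a minimal sketch, not an official tool: the helper names count_files and older_than are made up for illustration, and it assumes file age is judged by modification time, which the notes do not state. It runs on a throwaway directory; on the cluster you would pass your scratch path instead.

```shell
# Count regular files under a directory (scratch caps this at 1M per user).
count_files() {
    find "$1" -type f | wc -l
}

# List regular files not modified for more than N days.
# -mtime +N matches files whose age in whole days is greater than N.
older_than() {
    find "$1" -type f -mtime +"$2"
}

# Demo on a temporary directory so nothing real is touched.
demo=$(mktemp -d)
touch "$demo/fresh.dat"
touch -t 201701010000 "$demo/stale.dat"   # fake an old modification time

count_files "$demo"
older_than "$demo" 25                     # files approaching the 30-day limit
```

Checking at 25 days rather than 30 leaves a few days to copy anything worth keeping into work or group storage.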


  • ways to connect
    • ssh in terminal
      # or
      • password doesn’t show up as you type!
    • PuTTY on Windows
    • Exceed onDemand to ACI-I tutorial
      • config
        • Xconfig: Desktop_mode_1280x1024.cfg
        • Xstart: Gnome_Desktop.xs
      • only 1 session
      • you can always come back (don’t log out, just quit)
      • still gonna have to use cmdline
        • Apps -> System Tools -> Terminal


  • things to know
    • GIYF
    • man
    • banana
  • find total and available allocation hours
    mam-list-funds -h
  • manual for commands
    man cmd
  • see list of options using an improper flag
    mam-list-funds --banana
  • 4 most basic commands
    ls                            # list contents of curr dir
    pwd                           # print current dir
    cd scratch                    # change dir
    cp logFile logFile_13Sept2017 # copy file
  • other useful commands
    history # past commands
    mv # move files
    rm # remove files
    mkdir # make a directory
    find # find files
    grep # search text for lines matching a pattern
    awk # text manipulator
    du # disk usage
    clear # clear screen
  • special characters
    cd ~ # move to home
    cd . # move to here (stay here)
    cd .. # move one dir up
    ls *.png # list all png
    ls -1 | grep png # pipe output of ls to other commands
    ls > files.txt # put output in a file (files.txt is just an example name)
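
The commands and special characters above can be tried together in a short session. This is a minimal sketch that works in a temporary directory; the file and directory names are invented for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"
pwd                                  # print current dir

mkdir results                        # make a directory
touch results/plot1.png results/plot2.png results/notes.txt

ls results/*.png                     # wildcard: list only the png files
ls -1 results | grep png > png_list.txt   # pipe ls into grep, redirect into a file
cp png_list.txt png_list_backup.txt  # copy a file
cd ..                                # move one dir up
```

Note that grep png matches any name containing "png", not only names ending in .png; the wildcard form is stricter.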


  • a module is a wrapper for an individual program
    • e.g. in order to use Matlab you need to first load its module
  • show modules currently available
    module avail
  • search for modules
    module spider vasp
  • load a module, optionally w/ a specific version (otherwise you will get the default)
    module load ansys/18.1

    better to specify version.

  • module families
    module avail
    module load gcc/5.3.1
    module avail # new modules (compiled with, and hence conditional on, gcc/5.3.1) will show up
  • other cmds
    module list # list loaded modules
    module purge # clean up loaded modules
    module show modName # where libs of the module are, which env vars are set up
    • e.g. w/ boost module you can use show to help you set up lib paths when compiling w/ it
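
The module commands above are often combined at the top of a script. The sketch below is one possible arrangement, not a prescribed one: it checks that the module command exists before calling it, so it also runs (and just prints a notice) on a machine without modules. The variable module_status is made up for illustration.

```shell
# 'module' is normally provided by the cluster environment,
# so check for it before calling it.
if command -v module >/dev/null 2>&1; then
    module purge                 # start from a clean environment
    module load gcc/5.3.1        # pin the version rather than taking the default
    module list                  # confirm what is loaded
    module_status="loaded"
else
    echo "module command not found: run this on a cluster login node"
    module_status="unavailable"
fi
echo "module status: $module_status"
```

Purging first keeps the result independent of whatever was loaded earlier in the session.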

Transfer data

  • cmdline
    scp lFile
    rsync ...
    sftp ...
  • programs: WinSCP, FileZilla
  • via Box, Dropbox w/ EoD’s Firefox interface (no syncing)
    • (and no sound => no youtube!)
  • specific programs: Globus, Aspera
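
As a rough illustration of rsync, the sketch below syncs one temporary directory into another. Both directories are local stand-ins; for a real transfer the destination would take the form user@host:path, which is not filled in here. If rsync is not installed, the sketch falls back to a plain copy.

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
echo "sample data" > "$src/results.csv"

if command -v rsync >/dev/null 2>&1; then
    # -a preserves permissions and times, -v lists what is copied.
    # The trailing slash on the source copies its contents, not the directory itself.
    rsync -av "$src/" "$dst/"
    # A second run finds the file already up to date and does not resend it.
    rsync -av "$src/" "$dst/"
else
    cp -a "$src/." "$dst/"       # plain copy if rsync is not installed
fi
```

That second run is the main reason to prefer rsync over scp for repeated transfers of a large directory.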

Submitting a Job

  • submission scripts
    • two sections
      1. PBS directives
        #PBS directive ...
        • used for requesting resources
        • only at the beginning of the script
      2. commands
    • to submit
      qsub submitScript.pbs
    • e.g.
      #PBS -l nodes=1:ppn=1
      #PBS -l walltime=5:00
      #PBS -A open
      echo "Job started on $(hostname) at $(date)"
      module purge
      module load matlab/R2016a
      # go to the directory the job was submitted from
      cd $PBS_O_WORKDIR
      matlab-bin -nodisplay -nosplash < runThis.m > log.matlabRun
      echo "Job ended at $(date)"
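
When the same job is run with different resource requests, the submission script can be generated from a template. This is only a sketch under assumptions: the variables nodes, ppn, walltime and script are illustrative, and the script is written to a temporary file and not submitted.

```shell
# Hypothetical parameters for illustration.
nodes=1
ppn=1
walltime=5:00            # mm:ss, so five minutes
script=$(mktemp)

# The unquoted EOF lets $nodes etc. be filled in now; \$ keeps the
# variables that must only be expanded later, when the job runs.
cat > "$script" <<EOF
#PBS -l nodes=${nodes}:ppn=${ppn}
#PBS -l walltime=${walltime}
#PBS -A open
echo "Job started on \$(hostname) at \$(date)"
cd \$PBS_O_WORKDIR
echo "Job ended at \$(date)"
EOF

grep -c '^#PBS' "$script"    # count the directive lines
echo "to submit: qsub $script"
```

All #PBS lines are written before the first command, matching the rule that directives belong only at the beginning.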


  • ICS docs
  • iAsk
    • iAsk at, 54275
  • check doc of other batch systems: TACC, OSC
  • seminar 2
    • submitting jobs
    • compiling simple codes
    • allocation usage
    • intro to parallelization
      • distributed vs shared memory
    • data moving
      • globus, rsync
  • seminar 3 (Feb 2018)
    • optimization techniques