
Skaggs Building, 114.


If you want to follow along, please bring a laptop with R installed (needed on the first day) and an SSH client. Software for the second day will be discussed at the end of the first day.

Day 1

Time Presenter Topic
08.30 Tiago Introduction and DBS scientific computing architecture
9.00 Brice Selected Unix topics: user profiles, persistent shells, parallelization, evaluating available resources, submission to the SRA, etc.
11.00   Break
11.15 Travis Efficient computing, and how terminal commands can simplify your life
01.00   Break
02.00 Brice Selected R topics: Installing to local repositories, functions, environments, parallelization, condition handling, etc.
03.45 Christian R data.table: Big Fast Data
04.30   Break
04.45 Tiago Introducing the Python environment (Jupyter lab, matplotlib, Python3, conda)
05.30   End

Day 2

Time Presenter Topic
08.30 Brice R: Wrap up and case study examples
09.15 Tiago Data preprocessing (HDF5, VCF)
11.00   Break
11.15 Tiago Parallel processing with Dask, subsampling
01.00   Break
02.45 Doug Montana code
03.00   Break
03.15 Tiago Galaxy programmatic interface
05.30   End


  • R
    • The R Inferno: Fun and informative. "If you are using R and you think you’re in hell, this is a map for you."
    • Google R style guide: Write cleaner code!
    • CRAN FAQs: Details, including platform-specific notes for OS X and Windows.
    • Advanced R: Superior R learning material from Hadley Wickham.