Courses@CRG_course_nextflow_May_2022

Reproducible research and data analysis using Containers & Nextflow

About the course

This slow-paced hands-on course is designed for absolute beginners who want to start using Linux Containers (Docker and Singularity) and Nextflow to achieve reproducibility of data analysis.

Outline

The course will train participants to build Nextflow pipelines and run them with Linux containers.

It is designed to provide trainees with short and frequent hands-on sessions, while keeping theoretical sessions to a minimum.

The course will be fully virtual via the Zoom platform.

Learning objectives

  • Locate and fetch Docker/Singularity images from dedicated repositories.
  • Execute/Run a Docker/Singularity container from the command line.
  • Locate and fetch Nextflow pipelines from dedicated repositories.
  • Execute/Run a Nextflow pipeline.
  • Describe and explain Nextflow basic concepts.
  • Test and modify a Nextflow pipeline.
  • Implement short blocks of code into a Nextflow pipeline.
  • Develop a Nextflow pipeline from scratch.
  • Run a pipeline in diverse computational environments (local, HPC, cloud).
  • Share a pipeline.

Prerequisite / technical requirements

Participants should be comfortable working with the CLI (command-line interface) in a Linux-based environment. Knowledge of containers is not mandatory. The course materials are available online on the dedicated GitHub page for self-learning.

Participants will need to connect to a remote server during the course via the "ssh" protocol. You can learn about it here

Participants should be able to use a command-line/screen-oriented text editor (such as nano or vi/vim, which are already available on the server) or an editor that can connect remotely. For reference, the basics of "nano" are covered here: https://wiki.gentoo.org/wiki/Nano/Basics_Guide

Having a GitHub account is recommended.

Dates, time, location

  • Dates: 30th May to 3rd June 2022, and additionally 7th June 2022.

  • Time: 09:30-13:30 (CET)

  • Location: virtual, via Zoom.

Program

Day 1: Introduction to Linux containers, Docker (May 30)

  • 09:30-11:00 Introduction to containers and Docker
  • 11:00-11:30 Coffee break
  • 11:30-13:00 Docker

Day 2: Docker and Singularity (May 31)

  • 09:30-11:00 More advanced Docker
  • 11:00-11:30 Coffee break
  • 11:30-13:30 Singularity

Day 3: Understand and run a basic Nextflow pipeline (June 1)

  • 09:30-11:00 Introduction to Nextflow
  • 11:00-11:30 Coffee break
  • 11:30-13:30 Making simple scripts

Day 4: Write, modify and run a complex pipeline (June 2)

  • 09:30-11:00 Decoupling params, resources and main script
  • 11:00-11:30 Coffee break
  • 11:30-13:30 Using public pipelines

Day 5: Run a Nextflow pipeline in different environments, share and report (June 3)

  • 09:30-11:00 Profiles and cloud
  • 11:00-11:30 Coffee break
  • 11:30-13:30 Modules and Tower

Day 6: nf-core (June 7)

  • 09:30-10:30 Introduction to nf-core (TBC)
  • 10:30-11:00 nf-core for users I (TBC)
  • 11:00-11:30 Coffee break (TBC)
  • 11:30-12:30 nf-core for users II (TBC)
  • 12:30-13:30 nf-core for developers (TBC)

Acknowledgements

  • Sphinx. The publication system for our course pages.
  • ELIXIR Workshop Hackathon. Joint initiative with other colleagues to exchange course materials and teaching approaches for courses like this one.