Reproducible research and data analysis using Nextflow pipelines
Course page: https://www.sib.swiss/training/course/20211115_NEXTF
Course materials: https://biocorecrg.github.io/SIB_course_nextflow_Nov_2021/docs/
This slow-paced hands-on course is designed for absolute beginners who want to start using Nextflow to achieve reproducibility of data analysis.
The 4-half-day Nextflow course will train participants to build Nextflow pipelines and run them with Singularity containers.
It is designed to provide trainees with short and frequent hands-on sessions, while keeping theoretical sessions to a minimum.
The course will be fully virtual via the Zoom platform.
Trainees will work in a dedicated AWS environment.
- Locate and fetch Docker/Singularity images from dedicated repositories.
- Execute/Run a Docker/Singularity container from the command line.
- Locate and fetch Nextflow pipelines from dedicated repositories.
- Execute/Run a Nextflow pipeline.
- Describe and explain Nextflow basic concepts.
- Test and modify a Nextflow pipeline.
- Implement short blocks of code into a Nextflow pipeline.
- Develop a Nextflow pipeline from scratch.
- Run a pipeline in diverse computational environments (local, HPC, cloud).
- Share a pipeline.
Being comfortable working with the CLI (command-line interface) in a Linux-based environment. Knowledge of containers is not mandatory; however, this SIB course (https://www.sib.swiss/training/course/20211014_DOCK) can be advised to take. The course materials is online in the dedicated GitHub page for self-learning.
Practitioners will need to connect during the course to a remote server via the "ssh" protocotol. You can learn about it here
Those who follow the course should be able to use a command-line/screen-oriented text editor (such as nano or vi/vim, which are already available on the server) or to be able to use an editor able to connect remotely. For sake of information, below the basics of "nano": https://wiki.gentoo.org/wiki/Nano/Basics_Guide
Having a GitHub account is recommended.
-
Dates: November 15-18, 2021
-
Time: 13:00-17:30 (CET)
- Afternoon coffee break: 15:00-15:30
-
Location: virtual, via Zoom.
- 13:00-13:30 Getting started, become familiar with the working environment.
- 13:30-14:30 Talk by Cédric Notredame and Jose Antonio Espinosa from CRG.
- 14:30-15:00 Introduction to Docker and Singularity containers.
- 15:00-15:30 Break.
- 15:30-16:30 Docker hub, BioContainers and other repositories. Find existing containers. Execute a Singularity container.
- 16:30-17:30 Nexflow: introduction, installation, run a simple pipeline.
- 13:00-14:00 Nexflow basic concepts. Channels and Operators. Processes, Workflows and the log
- 14:00-14:30 Breakout rooms: create channels starting from data. Write and run more simple pipelines.
- 14:30-15:00 Troubleshooting, Q&A.
- 15:00-15:30 Break.
- 15:30-16:30 Combine processes, directives, resume the pipeline.
- 16:30-17:00 Breakout rooms: write and run a more complex pipeline.
- 17:00-17:30 Troubleshooting, Q&A.
- 13:00-14:00 Decoupling resources, parameters and nextflow script (params and nextflow config). Help section. How to get pipelines and run them.
- 14:00-14:30 Breakout rooms: example of directives, get a pipeline and run it with a test dataset.
- 14:30-15:00 Troubleshooting, Q&A.
- 15:00-15:30 Break.
- 15:30-16:30 Fetch public pipelines, adapt and run them.
- 16:30-17:00 Breakout rooms: exercise.
- 17:00-17:30 Troubleshooting, Q&A.
- 13:00-14:00 Using Singularity, profiles and deploying on AWS.
- 14:00-14:30 Breakout rooms: Deploy your pipeline on the cloud.
- 14:30-15:00 Troubleshooting, Q&A.
- 15:00-15:30 Break.
- 15:30-16:30 Modules, reporting, share Nextflow pipelines on github. Nextflow Tower.
- 16:30-17:00 Breakout rooms: exercise.
- 17:00-17:30 Troubleshooting, Q&A.
- Sphinx. The publication system for our course pages.
- ELIXIR Workshop Hackathon. Joined initiative with other colleagues to exchange materials for courses and approaches for courses like this.