Next generation sequencing (NGS) has become an essential tool in genetic and genomic analysis. It is increasingly important for experimental scientists to gain the bioinformatics skills required to analyse the large volumes of data produced by next generation sequencers. This course will equip participants with the essential informatics skills required to begin analysing NGS data and apply some of the most commonly used tools and resources for sequence data analysis.
The programme will cover prominent sequencing technologies, algorithmic theory and principles of bioinformatics, with a strong focus on practical computational sessions using sequence analysis techniques and tools applicable to any species or genome size. A range of post-sequencing analysis applications will be covered: QC, alignment, assembly, variant calling and RNA-Seq.
- The course virtual machine
- Access to the Vula platform for video lectures and assignment submission portal
- Attending the contact sessions twice a week (via Zoom) for questions and practicals
- Attend 80% of all contact sessions (if this is impossible, you must inform the TAs for your classroom).
- Submit 80% of practical assignments by the relevant hand-in dates.
- Submit assessments by the relevant hand-in dates and obtain a minimum grade of 60% overall for the assessments (one assessment per Module).
The current course runs from 22 March to 9 June 2022.
NB: The module content in this Git repository may not be up to date while the course runs, as it is based on the 2021 materials.
- Sumir Panji, University of Cape Town, South Africa
- Amel Ghouila, Bill and Melinda Gates Foundation, USA
- Narendar Kumar, Wellcome Sanger Institute, UK
- Fatma Guerfali, Institut Pasteur de Tunis, Tunisia
- Shaun Aron, University of the Witwatersrand, South Africa
- Gerrit Botha, University of Cape Town, South Africa
- Petr Danecek, Wellcome Sanger Institute, UK
- Eugene Gardner, Wellcome Sanger Institute, UK
- Jon Ambler, University of Cape Town, South Africa
- Nyasha Chambwe, St Jude Children's Research Hospital, USA
- Phelelani Mpangase, University of the Witwatersrand, South Africa
- Vivek Iyer, University of Cape Town, South Africa
- Module 1: Intro to Unix/Linux 29 March & 1 April 2022
- Module 2: Introduction to NGS technologies 5 April 2022
- Module 3: NGS data formats and QC 7 & 12 April 2022
- Module 4: Alignment to Reference 19 April 2022
- Module 5: Variant Calling - Human 21 & 26 April 2022
- Module 6: Variant Calling - Choose Pathogen/Human 3, 5 & 10 May 2022
- Module 7: RNA-seq Human 12 May 2022
- Module 8: RNA-seq Pathogen 17 May 2022
- Module 9: ChIP-seq 19 May 2022
- Module 10: Genome Assembly 31 May & 2 June 2022
(link to follow)
Module 1 - Intro to Unix/Linux
Module 2 - Introduction to NGS technologies
- Lecture Part 1 PDF version
- Lecture Part 2 PDF version
- Lecture Part 3 PDF version
- Lecture Part 4 PDF version
Module 3 - NGS data formats and QC
Module 4 - Alignment to Reference
Module 5 - Variant Calling - Human
Module 6 - Variant Calling - Choose Pathogen or Human
- Day 1 Practical Manual Online version
- Day 1 Practical Worksheet Online version
- Day 2 Practical Manual Online version
- Day 2 Practical Worksheet Online version
Module 7 - RNA-seq Human
Module 8 - RNA-seq Pathogen
Module 9 - ChIP-seq
Module 10 - Genome Assembly
- bcftools
- bedtools
- igv
- picard
- bwa
- breakdancer
- lumpy-sv
- minimap2
- sniffles
- hisat2
- kallisto
- r-sleuth
- bowtie2
- macs2
- meme
- ucsc-bedgraphtobigwig
- ucsc-fetchchromsizes
- assembly-stats
- canu
- kmer-jellyfish
- seqtk
- velvet
- wtdbg
- freebayes
- gatk4
- pysam
- genomescope.R
- samtools
- fastqc
- multiqc
- trimmomatic
- vcftools
- iqtree
- snpeff
- snp-sites
- GenomicFeatures
- DESeq2
- tximport
- pheatmap
- salmon
conda create -n salmon -c conda-forge -c bioconda salmon
conda activate salmon
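
Once an environment is active, it can be useful to confirm which of the command-line tools listed above are actually on your `PATH`. The snippet below is a minimal sketch of such a check, not part of the official course setup: the function name `check_tools` is hypothetical, and the example assumes the executables are called `samtools`, `bcftools`, `bwa`, `fastqc` and `salmon`. It only applies to command-line programs, not to R packages such as DESeq2 or tximport.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    # Return non-zero if anything was missing, so the check can gate a script.
    [ "$missing" -eq 0 ]
}

# Example: check a few of the command-line tools listed above.
check_tools samtools bcftools bwa fastqc salmon || echo "Some tools are not installed yet."
```

Any tool reported as missing can then be installed in the same way as salmon above, or into its own environment.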
Any reuse of the course materials, data or code is encouraged with due acknowledgement.
This work is licensed under a Creative Commons Attribution 4.0 International License.