NGS_Bio_Africa

Next Generation Sequencing Bioinformatics Africa

Course Overview

Next generation sequencing (NGS) has become an essential tool in genetic and genomic analysis. It is increasingly important for experimental scientists to gain the bioinformatics skills required to analyse the large volumes of data produced by next generation sequencers. This course will equip participants with the essential informatics skills required to begin analysing NGS data and apply some of the most commonly used tools and resources for sequence data analysis.

The programme will cover prominent sequencing technologies, algorithmic theory and principles of bioinformatics, with a strong focus on practical computational sessions using sequence analysis techniques and tools applicable to any species or genome size. A variety of post-sequencing analysis applications will be covered, including QC, alignment, assembly, variant calling and RNA-Seq.

Requirements for course

  • The course virtual machine
  • Access to the Vula platform, which hosts the video lectures and the assignment submission portal
  • Attendance at the contact sessions twice a week (via Zoom) for questions and practicals

Requirements for certification

  • Attend 80% of all contact sessions. If you cannot attend a session, you must inform the TAs for your classroom.
  • Submit 80% of practical assignments by the relevant hand-in dates.
  • Submit assessments by the relevant hand-in dates and obtain a minimum grade of 60% overall for the assessments (one assessment per Module).

Dates running

The current course runs from 22 March to 9 June 2022.

NB: The module content in this repository may not be up to date while the course runs, as it is based on the 2021 materials.

Instructors

Timetable

Overview

  • Module 1: Intro to Unix/Linux 29 March & 1 April 2022
  • Module 2: Introduction to NGS technologies 5 April 2022
  • Module 3: NGS data formats and QC 7 & 12 April 2022
  • Module 4: Alignment to Reference 19 April 2022
  • Module 5: Variant Calling - Human 21 & 26 April 2022
  • Module 6: Variant Calling - Choose Pathogen/Human 3, 5 & 10 May 2022
  • Module 7: RNA-seq Human 12 May 2022
  • Module 8: RNA-seq Pathogen 17 May 2022
  • Module 9: ChIP-seq 19 May 2022
  • Module 10: Genome Assembly 31 May & 2 June 2022

Detailed timetable

(link to follow)

Course manual

Module 1 - Intro to Unix/Linux

Module 2 - Introduction to NGS technologies

Module 3 - NGS data formats and QC

Module 4 - Alignment to Reference

Module 5 - Variant Calling - Human

Module 6 - Variant Calling - Choose Pathogen or Human

Module 7 - RNA-seq Human

Module 8 - RNA-seq Pathogen

Module 9 - ChIP-seq

Module 10 - Genome Assembly

List of Software loaded onto the virtual machine:

  • bcftools
  • bedtools
  • igv
  • picard
  • bwa
  • breakdancer
  • lumpy-sv
  • minimap2
  • sniffles
  • hisat2
  • kallisto
  • r-sleuth
  • bowtie2
  • macs2
  • meme
  • ucsc-bedgraphtobigwig
  • ucsc-fetchchromsizes
  • assembly-stats
  • canu
  • kmer-jellyfish
  • seqtk
  • velvet
  • wtdbg
  • freebayes
  • gatk4
  • pysam
  • genomescope.R
  • samtools
  • fastqc
  • multiqc
  • trimmomatic
  • vcftools
  • iqtree
  • snpeff
  • snp-sites

R packages

  • GenomicFeatures
  • DESeq2
  • tximport
  • pheatmap

Software which must be added to the VM:

  • salmon

To add salmon, create a dedicated conda environment and activate it:

conda create -n salmon salmon
conda activate salmon
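
After the salmon environment is active, it can be useful to confirm that the tool is on the PATH before starting a practical. The snippet below is a minimal sketch of such a check, not part of the official course setup. The helper name `check_tool` is hypothetical, and the snippet assumes the executable is called `salmon`, matching the conda package name.

```shell
# Hypothetical helper: report whether an executable can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Run this inside the activated environment (conda activate salmon).
check_tool salmon
```

If the check reports `salmon: missing`, the environment is probably not active in the current shell. The same helper can be pointed at other tools whose executable matches the package name, for example `check_tool samtools`.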

Any reuse of the course materials, data or code is encouraged with due acknowledgement.


License

This work is licensed under a Creative Commons Attribution 4.0 International License.