
Assign clades to SARS-CoV-2 samples


sarscov2barcode


Prerequisites

Instructions

  1. Map the reads to the reference genome with bwa or bowtie to obtain sorted BAM files.
  2. Run the bcftools commands below and pipe the VCF output to the assignclade.py script:
bcftools mpileup \
-f GCF_009858895.2_ASM985889v3_genomic.fna sample_1_sorted.bam \
| bcftools call --ploidy 1 -mv -Ov \
| python3 assignclade.py
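
The pipeline above hands plain VCF text to the script on standard input. As a rough illustration of what reading that stream can look like, here is a minimal sketch. It is not the code of assignclade.py: the function name `read_variants` and the two-record example VCF are assumptions made for illustration, and the sketch keeps only single-nucleotide substitutions.

```python
import io


def read_variants(vcf_stream):
    """Collect {position: ALT allele} from VCF text (hypothetical helper)."""
    variants = {}
    for line in vcf_stream:
        # Skip meta-information lines, the column header, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        pos, ref = int(fields[1]), fields[3]
        # ALT may list several alleles separated by commas; take the first.
        first_alt = fields[4].split(",")[0]
        # Keep single-nucleotide substitutions only (assumption of this sketch).
        if len(ref) == 1 and len(first_alt) == 1:
            variants[pos] = first_alt
    return variants


# Illustrative input; in the real pipeline the stream would be sys.stdin.
example_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "NC_045512.2\t241\t.\tC\tT\t225\t.\t.\n"
    "NC_045512.2\t23403\t.\tA\tG\t225\t.\t.\n"
)
variants = read_variants(example_vcf)
print(variants)
```

Passing `sys.stdin` instead of the `io.StringIO` object would let the same function consume the output of `bcftools call` directly.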

Output


{241: 'T', 3037: 'T', 14408: 'T', 23403: 'G', 27904: 'C', 28854: 'T'}

Sample ['C' 'T' 'G' 'T' 'T' 'T' 'G' 'G' 'T' 'G' 'G']
Match  ['|' '|' '|' '|' '|' '|' '|' '|' '|' '|' '|']
Clade G['C' 'T' 'G' 'T' 'T' 'T' 'G' 'G' 'T' 'G' 'G']
Clade G matches 100%
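
The match percentage in the output above can be understood as the share of barcode positions at which the sample and the clade carry the same base. The sketch below illustrates that calculation under stated assumptions: `match_percent` and `match_line` are hypothetical helpers, not functions from assignclade.py, the space used to mark a mismatch is a guess, and `made_up_barcode` is an invented barcode that does not correspond to any real clade. Only the sample and Clade G barcodes are taken from the output shown above.

```python
def match_percent(sample, clade):
    """Percentage of barcode positions where sample and clade agree."""
    if len(sample) != len(clade):
        raise ValueError("barcodes must have the same length")
    hits = sum(s == c for s, c in zip(sample, clade))
    return 100.0 * hits / len(sample)


def match_line(sample, clade):
    """'|' where the bases agree, ' ' where they differ (assumed marker)."""
    return ["|" if s == c else " " for s, c in zip(sample, clade)]


# Barcodes copied from the example output above.
sample = list("CTGTTTGGTGG")
clade_g = list("CTGTTTGGTGG")

# Invented barcode with one differing base, for illustration only.
made_up_barcode = list("CTGTTTGGTGA")

print(match_line(sample, clade_g))
print(f"Clade G matches {match_percent(sample, clade_g):.0f}%")
print(f"Made-up barcode matches {match_percent(sample, made_up_barcode):.1f}%")
```

Scoring the sample against every clade barcode in this way and reporting the highest percentage is one plausible way to arrive at an assignment.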

Citation

Qingtian Guan, Mukhtar Sadykov, Sara Mfarrej, Sharif Hala, Raeece Naeem, Raushan Nugmanova, Awad Al-Omari, Samer Salih, Abbas Al Mutair, Michael J. Carr, William W. Hall, Stefan T. Arold, Arnab Pain.
The genomic variation landscape of globally-circulating clades of SARS-CoV-2 defines a genetic barcoding scheme. doi: https://doi.org/10.1016/j.ijid.2020.08.052