- python3
- scikit-allel
- pandas
- numpy
- bcftools (http://www.htslib.org/download/)
- Map the reads to the reference genome with bwa or bowtie to obtain sorted BAM files
- Run the bcftools command below and pipe the VCF output to the assignclade.py script
bcftools mpileup \
-f GCF_009858895.2_ASM985889v3_genomic.fna sample_1_sorted.bam \
| bcftools call --ploidy 1 -mv -Ov \
| python3 assignclade.py
{241: 'T', 3037: 'T', 14408: 'T', 23403: 'G', 27904: 'C', 28854: 'T'}
Sample ['C' 'T' 'G' 'T' 'T' 'T' 'G' 'G' 'T' 'G' 'G']
Match ['|' '|' '|' '|' '|' '|' '|' '|' '|' '|' '|']
Clade G['C' 'T' 'G' 'T' 'T' 'T' 'G' 'G' 'T' 'G' 'G']
Clade G matches 100%
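The output above can be read as two stages: the variant calls are collected into a position-to-base dictionary, and the sample's barcode is then compared base by base against each clade's barcode. The following is a minimal sketch of those two stages in plain Python, not the actual code of assignclade.py. The function names `variants_from_vcf` and `compare_barcodes` are hypothetical, and the VCF records in the example are illustrative only. How the script maps variant positions onto the barcode is not shown here, so the sketch takes the barcodes as given.

```python
def variants_from_vcf(lines):
    """Collect single-base substitutions from VCF text lines.

    Hypothetical helper: returns {position: alternate base} and skips
    header lines, indels and multi-allelic records.
    """
    variants = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # blank line or VCF header
        fields = line.rstrip("\n").split("\t")
        pos, ref, alt = int(fields[1]), fields[3], fields[4]
        if len(ref) == 1 and len(alt) == 1 and alt != ".":
            variants[pos] = alt
    return variants


def compare_barcodes(sample, clade):
    """Compare two equal-length barcodes base by base.

    Hypothetical helper: returns the list of match marks ('|' for a
    match, ' ' for a mismatch) and the percentage of matching positions.
    """
    if len(sample) != len(clade):
        raise ValueError("barcodes must have the same length")
    marks = ["|" if s == c else " " for s, c in zip(sample, clade)]
    percent = 100.0 * marks.count("|") / len(clade)
    return marks, percent


# Illustrative VCF text; in the pipeline this would arrive on sys.stdin.
vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "NC_045512.2\t241\t.\tC\tT\t225\t.\t.\n"
    "NC_045512.2\t23403\t.\tA\tG\t225\t.\t.\n"
)
variants = variants_from_vcf(vcf_text.splitlines())

# Barcodes copied from the sample output above.
sample = list("CTGTTTGGTGG")
clade_g = list("CTGTTTGGTGG")
marks, percent = compare_barcodes(sample, clade_g)
print(variants)
print("Clade G matches {:.0f}%".format(percent))
```

To read from the pipe instead of a string, `variants_from_vcf(sys.stdin)` would work the same way after `import sys`, since the function only iterates over lines.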
Qingtian Guan, Mukhtar Sadykov, Sara Mfarrej, Sharif Hala, Raeece Naeem, Raushan Nugmanova, Awad Al-Omari, Samer Salih, Abbas Al Mutair, Michael J. Carr, William W. Hall, Stefan T. Arold, Arnab Pain.
The genomic variation landscape of globally-circulating clades of SARS-CoV-2 defines a genetic barcoding scheme
doi: https://doi.org/10.1016/j.ijid.2020.08.052