
This repo contains a trial phylogenetic analysis using virus data downloaded from GenBank.


The data folder contains additional FASTA files, as well as a new folder with the command-line code used to process the files.


The data folder has two additional FASTA files: one with the complete sequence of the HA gene, and a final HA sequences file with all the sequences that surpass the threshold of 60% of the complete sequence length.

The notebook shows simple Python code used to eliminate short sequences.
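
As a rough illustration of this kind of length filter (not the notebook's actual code), the sketch below keeps only sequences longer than 60% of a reference length. The function names `read_fasta` and `filter_by_length`, the toy sequences, and the reference length of 10 are all hypothetical; in practice the reference length would be the length of the complete HA sequence.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, reference_length, min_fraction=0.6):
    """Keep records strictly longer than min_fraction * reference_length."""
    cutoff = min_fraction * reference_length
    return [(h, s) for h, s in records if len(s) > cutoff]


# Toy input (hypothetical data): sequences of length 10, 7 and 4.
example = StringIO(">full\nATGAAGGCAA\n>partial_ok\nATGAAGG\n>too_short\nATGA\n")
records = list(read_fasta(example))
kept = filter_by_length(records, reference_length=10)
print([h for h, _ in kept])  # ['full', 'partial_ok']
```

For a real file, `StringIO(...)` would be replaced by `open(path)` on one of the FASTA files in the data folder.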


The Results folder has a tree drawn using MEGA software (Maximum Likelihood method). The multiple sequence alignment (MSA) was done with MUSCLE.

comment below Mike Javan

  • ....