H3N2L194P: A Python repository from wchnicholas

H3N2 egg-adaptive substitution L194P

This analysis is adapted from McWhite et al. (2016).

Fasta/pdmH1N1_All.fa: 2009 pandemic H1N1 (swine flu) HA sequences downloaded from GISAID
Fasta/HumanH3N2_All.fa: Human H3N2 HA sequences downloaded from GISAID
- Because GISAID limits the number of sequences that can be downloaded at once, the sequences for this project were first downloaded separately by continent. The per-continent files were then combined into a single Fasta file.
Fasta/Bris07_fromNCBI.fa: 11 Bris07 sequences from the NCBI protein database were obtained by searching "A/Brisbane/10/2007", hemagglutinin.
Fasta/HK14_fromGISAID.fa: 8 HK14 sequences from GISAID were obtained by searching "A/Hong Kong/4801/2014".
Fasta/Sing16_fromGISAID.fa: 4 Sing16 sequences from GISAID were obtained by searching "A/Singapore/INFIMH-16-0019/2016".
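
The per-continent merge described for Fasta/HumanH3N2_All.fa could look roughly like the sketch below. It is not the script used in this repository. The function names (`read_fasta`, `combine_fasta`) and the per-continent file names in the usage note are hypothetical, and dropping records with a repeated header is an assumption added as a safeguard against overlapping downloads.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def combine_fasta(in_paths, out_path):
    """Merge several FASTA files into one, keeping the first record per header.

    Returns the number of records written.
    """
    seen = set()
    written = 0
    with open(out_path, "w") as out:
        for path in in_paths:
            for header, seq in read_fasta(path):
                if header in seen:  # assumption: identical header = duplicate
                    continue
                seen.add(header)
                out.write(f">{header}\n{seq}\n")
                written += 1
    return written
```

With hypothetical inputs, a call such as `combine_fasta(["Fasta/Asia.fa", "Fasta/Europe.fa"], "Fasta/HumanH3N2_All.fa")` would write the merged file and return the record count.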

Plot the frequency of the different amino acids observed at residue 194 in each year

Rscript script/Plot_ProVsPSG.R
- Input file:
  - result/HumanH3N2_PSG.tsv
- Output file:
  - graph/HumanH3N2_ProVsPSG.png
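
For orientation, the kind of per-year tally behind such a plot can be sketched in Python as follows. This is an illustration under stated assumptions, not the code that produces result/HumanH3N2_PSG.tsv. It assumes that each header begins with a strain name ending in a four-digit year (for example "A/Brisbane/10/2007"), optionally followed by `|`-separated fields, and that the caller supplies the 0-based string index corresponding to residue 194 in their own sequences. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def year_from_header(header):
    """Return the 4-digit year at the end of the strain name, or None."""
    strain = header.split("|")[0].strip()
    year = strain.rsplit("/", 1)[-1].strip()
    return int(year) if year.isdigit() and len(year) == 4 else None


def residue_frequency_by_year(records, index):
    """Tally the amino acid at `index` (0-based) for each collection year.

    `records` is an iterable of (header, sequence) pairs. Records with no
    parsable year, or too short to contain `index`, are skipped.
    Returns {year: {amino_acid: fraction}}.
    """
    counts = defaultdict(Counter)
    for header, seq in records:
        year = year_from_header(header)
        if year is None or index >= len(seq):
            continue
        counts[year][seq[index]] += 1
    return {
        year: {aa: n / sum(c.values()) for aa, n in c.items()}
        for year, c in sorted(counts.items())
    }
```

The mapping from residue 194 to a string index depends on the numbering scheme and on any signal peptide or alignment gaps in the input, so it is left to the caller rather than hard-coded.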

python script/ParseNCBIseq.py
- Input file:
  - Fasta/Bris07_fromNCBI.aln
- Output: printed to standard output