/covariants

Scripts to investigate the SUKS cluster

Primary Language: Python

CoVariants: SARS-CoV-2 Mutations and Variants of Interest

Emma B. Hodcroft¹

¹Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland

Please cite and link back to this site if you use this resource - Thank you!

This repository is intended to provide an overview (not necessarily complete) of SARS-CoV-2 mutations that are of interest. It should be noted that these mutations are primarily of interest due to spread in Europe: this is simply a reflection that the primary maintainer/author (Emma Hodcroft) works mostly with European data.

The code used to generate these tables, graphs, and the sequences related to that mutation can be found in this repository.

The SARS-CoV-2 pandemic & research surrounding it is ongoing. I will make every effort to try to keep this repository up-to-date, but readers should take care to double-check that the information is the latest available.
I welcome pull requests (PRs) to this repository providing new links and information! The more detail you can include in a PR, the faster I'll be able to review it. If possible, provide a PR that adds or edits the appropriate links, and I can merge it faster. If you can't do that, opening an issue is fine, but I might be slower to incorporate it.

Mutations

Overview of all mutation tables & graphs

Overview of all mutation country plots

Index

Clusters/mutations are listed below by the location of a mutation in the spike protein (S:) - the letter after : indicates the original amino-acid, the number the position in the spike protein, and the last letter, the 'new' amino-acid.
As S:N501 has multiple amino-acid mutations, there is no second letter.
20A.EU1 and 20A.EU2, because of their prominence, have been given 'subclade' names. The mutation is listed in parentheses after the name.
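
The naming convention above can be sketched in a few lines of Python. This is only an illustration of how the labels are read, not code from this repository: the `Mutation` type, the `parse_label` function, and the regular expression are hypothetical names introduced here.

```python
import re
from typing import NamedTuple, Optional


class Mutation(NamedTuple):
    """Hypothetical container for one label such as 'S:A222V'."""
    gene: str
    original: str        # original amino-acid
    position: int        # position in the protein
    new: Optional[str]   # 'new' amino-acid; None when the label has no second letter


# gene name, ':', original amino-acid, position, optional new amino-acid
# ('-' is used for a deletion and '*' for a stop codon in the labels on this page)
_LABEL = re.compile(r"^(?P<gene>[A-Za-z0-9]+):(?P<orig>[A-Z])(?P<pos>\d+)(?P<new>[A-Z*-])?$")


def parse_label(label: str) -> Mutation:
    """Split a 'gene:mutation' label into its parts."""
    m = _LABEL.match(label)
    if m is None:
        raise ValueError(f"not a gene:mutation label: {label!r}")
    return Mutation(m["gene"], m["orig"], int(m["pos"]), m["new"])


for label in ("S:A222V", "S:N501", "S:H69-"):
    print(label, "->", parse_label(label))
```

For `S:N501` the `new` field is `None`, reflecting that several different amino-acid changes share that label.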

20A.EU1 (S:A222V)

Figure of S:A222V

Figure made via GISAID

Dedicated 20A.EU1 Nextstrain build

Table and charts of mutation distribution

  • Defining mutations:
    • Nonsynonymous: S:A222V; ORF10:V30L; N:A220V or ORF14:L67F (overlapping reading frame with N)
    • Synonymous: T445C, C6286T, C26801G
  • S:A222V
    • Mutation in the N-terminal domain (NTD), which is not known to play a direct role in receptor binding or membrane fusion
    • Associated with a cluster that initially expanded in Spain and spread across Europe via holiday travel (see Hodcroft et al preprint)

20A.EU2 (S:S477N)

Figure of S:S477N

Figure made via GISAID

Dedicated 20A.EU2 Nextstrain build

Table and charts of mutation distribution

  • Note that this cluster is only the European appearance of S:S477N
  • Defining mutations:
    • Nonsynonymous: S:S477N; N:M234I, A376T; ORF1b:A176S, V767L, K1141R, E1184D
    • Synonymous: C4543T, G5629T, C11497T, T26876C
  • S:S477N

S:N501

Figure of S:N501

Figure made via GISAID

Dedicated S:N501 Nextstrain build

Table and charts of mutation distribution

  • Defining mutations:
    • Has appeared multiple times independently: each can be associated with different accompanying mutations
    • Amino-acid changes are N501Y (nucleotide mutation A23063T), N501T (nucleotide mutation A23064C), and N501S (nucleotide mutation A23064G)
  • S:N501
    • Mutation is in the receptor binding domain (RBD), important to ACE2 binding and antibody recognition
    • N501Y is associated with recently reported 'new variants' in the UK and South Africa:
      • '20B/501Y.V1' (B.1.1.7) was announced in the South East of England on 14 Dec 2020 (COG-UK Report, Rambaut et al., PHE report, PHE Technical Report 2, PHE Technical Report 3)
        • This particular variant is associated with multiple mutations in Spike, including: N501Y, a deletion at 69/70 (as seen in S:N439K & S:Y453F) (Kemp et al. bioRxiv (21 Dec)), Y144 deletion, and P681H (adjacent to the furin cleavage site).
        • There is also a notable truncation of ORF8, with Q27* (becomes a stop codon) (deletion of ORF8 was previously associated with reduced clinical severity (Young et al. Lancet)), and mutations in N: N:D3L and S235F.
      • '20C/501Y.V2' (B.1.351) is found in South Africa and was also announced in December 2020 (Tegally et al., medRxiv)
        • This variant is associated with multiple mutations in Spike, including: N501Y, K417N, and D80A.
        • There is also an N mutation: T205I.
        • It does not have the deletion at 69/70.
    • Smaller clusters also seen in Wales, USA, & Australia
    • May be associated with adaptation to rodents and mustelids: N501T in ferrets (Richard et al. Nature Comm.) and mink (Welkers et al. Virus Evolution); N501Y in mice (Gu et al. Science)
      • Some have speculated about the risk of a persistent reservoir in wild rodents/mustelids
    • May increase ACE2 binding (Bloom Lab ACE2 binding website) - in particular it is predicted to do this by increasing the time spent in the 'open' conformation (Teruel et al., bioRxiv)
    • N501Y was found in longitudinally-collected samples from an immunocompromised patient (Choi et al. NEJM)
    • In one study, sera from previously infected patients neutralised viruses with S:501N and S:501Y equally (Xie et al., bioRxiv)
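
The link between the nucleotide mutations and the amino-acid changes listed for S:N501 can be checked with a short sketch. It rests on two assumptions that are not stated on this page: that the spike gene starts at nucleotide 21563 of the reference genome (Wuhan-Hu-1), and that the reference codon for position 501 is AAT. The function names are hypothetical and this is not code from this repository.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

SPIKE_START = 21563  # assumption: 1-based genome position of the first spike nucleotide


def spike_codon(genome_pos: int) -> tuple:
    """Return (codon number in spike, 0-based index within that codon)."""
    offset = genome_pos - SPIKE_START
    return offset // 3 + 1, offset % 3


def amino_acid_change(ref_codon: str, mutation: str) -> str:
    """Translate a nucleotide mutation such as 'A23063T' into an 'S:' label."""
    ref, pos, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    codon_no, idx = spike_codon(pos)
    if ref_codon[idx] != ref:
        raise ValueError(f"{mutation} does not match reference codon {ref_codon}")
    new_codon = ref_codon[:idx] + alt + ref_codon[idx + 1:]
    return f"S:{CODON_TABLE[ref_codon]}{codon_no}{CODON_TABLE[new_codon]}"


REF_CODON_501 = "AAT"  # assumption: reference codon for asparagine at spike position 501

for mut in ("A23063T", "A23064C", "A23064G"):
    print(mut, "->", amino_acid_change(REF_CODON_501, mut))
```

Under these assumptions the three nucleotide mutations fall in the first and second positions of codon 501 and give N501Y, N501T, and N501S respectively.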

S:H69-

Figure of S:H69-

Figure made via GISAID
Note this figure shows both the 69 & 70 deletion.

Dedicated S:H69- Nextstrain build

Table and charts of mutation distribution

  • Defining mutations:
    • Nonsynonymous: S:H69- (nucleotides: C21767-, A21768-, T21769-)
  • S:H69-
    • This deletion has arisen 3 times in 'recognised clusters': in the S:Y453F, S:N439K, and S:N501Y clusters (Kemp et al. bioRxiv (21 Dec)); it has also arisen additional times outside of recognised clusters.
    • May alter the recognition by antibodies, possibly impacting some antibody-therapy treatments, or immunity (Kemp et al. medRxiv (19 Dec)).
    • In particular, the deletion is predicted structurally to 'tuck in' the Spike N-terminal domain (Kemp et al. bioRxiv (21 Dec))

Important: Currently this build detects only the deletion at position 69 in spike, as due to alignment/calling differences, detecting the deletion at position 70 is less reliable. However, they seem to be highly associated.

S:N439K

Figure of S:N439K

Figure made via GISAID

Dedicated S:N439K Nextstrain build

Table and charts of mutation distribution

S:Y453F

Figure of S:Y453F

Figure made via GISAID

Dedicated S:Y453F Nextstrain build

Table and charts of mutation distribution

  • Defining mutations:
    • Has appeared multiple times independently: each can be associated with different accompanying mutations
  • S:Y453F

S:S98F

Figure of S:S98F

Figure made via GISAID

Dedicated S:S98F Nextstrain build

Table and charts of mutation distribution

  • Defining mutations:
    • Nonsynonymous: S:S98F; N:P199L or ORF14:Q46* (overlapping reading frames); ORF3a:Q38R, G172R, V202L
    • Synonymous: C28651T
  • S:S98F
    • Mostly found in Belgium and the Netherlands - predominantly Belgium
  • Little else is known about this mutation. Please let me know if you have more information!

S:E484

Dedicated S:E484 Nextstrain build

Table and charts of mutation distribution

  • Defining mutations:

    • Has appeared multiple times independently: each can be associated with different accompanying mutations
  • S:E484

    • Primarily associated with the 501Y.V2 variant that arose in South Africa in the winter of 2020 (Tegally et al., medRxiv), and a variant predominantly found in Brazil (de Vasconcelos et al., medRxiv), but has appeared independently numerous times around the world.
    • Mutations at S:E484 may significantly reduce convalescent serum neutralization (Greaney et al., medRxiv)
    • There has been a case of reinfection associated with S:E484K: a woman previously infected with a non-S:E484K variant of SARS-CoV-2 was later reinfected with a virus carrying the S:E484K mutation (Nonaka et al., PrePrints)
    • In one study co-incubating SARS-CoV-2 with convalescent plasma, neutralization was completely escaped at day 73 due to an S:E484K mutation (Andreano et al., bioRxiv)
  • Little else is known about this mutation. Please let me know if you have more information!

More information coming soon!

S:D80Y

Figure of S:D80Y

Figure made via GISAID

Dedicated S:D80Y Nextstrain build

Table and charts of mutation distribution

  • Defining mutations:
    • Nonsynonymous: S:D80Y; N:S186Y or ORF14:P33T (overlapping reading frames), D377Y; ORF1a:T945I, T1567I, Q3346K, V3475F, M3862I; ORF1b:P255T; ORF7a: R80I
    • Synonymous: G4960T, C6070T, C7303T, C7564T, C10279T, C10525T, C10582T, C27804T
    • Of the full list of 18 nucleotide mutations, 15 are mutations to T (possibly related to APOBEC-like editing within host, see Simmonds, bioRxiv)
  • S:D80Y
    • At the opposite end of the loop 'tucked in' by the 69/70 deletion (hypothetical association). See S:H69- for more detail on the impact of 69/70 deletion.
    • Found in at least 10 countries across Europe
  • Little else is known about this mutation. Please let me know if you have more information!
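
The tally of mutations to T mentioned above can be reproduced in part from what is listed on this page. Only the 8 synonymous mutations are given here as nucleotide changes, so the sketch below covers those 8 of the 18; the nucleotide changes behind the nonsynonymous mutations are not listed and are not guessed. The variable names are hypothetical and this is not code from this repository.

```python
from collections import Counter

# Synonymous nucleotide mutations listed for the S:D80Y cluster
synonymous = [
    "G4960T", "C6070T", "C7303T", "C7564T",
    "C10279T", "C10525T", "C10582T", "C27804T",
]

# Count each (original base, new base) pair, e.g. ('C', 'T')
changes = Counter((m[0], m[-1]) for m in synonymous)

# Number of mutations whose new base is T
to_t = sum(n for (_, new), n in changes.items() if new == "T")

print(f"{to_t} of {len(synonymous)} listed synonymous mutations are to T")
for (orig, new), n in sorted(changes.items()):
    print(f"{orig}>{new}: {n}")
```

All of the listed synonymous mutations are changes to T, and most are C to T.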

S:A626S

Figure of S:A626S

Figure made via GISAID

Dedicated S:A626S Nextstrain build

Table and charts of mutation distribution

  • Defining mutations:
    • Nonsynonymous: S:A626S (G23438T)
    • Synonymous: (none)
  • S:A626S
    • Found widely across Europe, in at least 15 countries
  • Little else is known about this mutation. Please let me know if you have more information!

S:V1122L

Figure of S:V1122L

Figure made via GISAID

Dedicated S:V1122L Nextstrain build

Table and charts of mutation distribution

  • Defining mutations:
    • Nonsynonymous: S:V1122L (G24926T)
    • Synonymous: (none)
  • S:V1122L
    • Found primarily in Sweden and northern European countries, including Norway and Denmark
  • Little else is known about this mutation. Please let me know if you have more information!