nextstrain/seasonal-flu

Annotate more granular clades on nextflu site

huddlej opened this issue · 0 comments

Context
We currently define clades for all Nextstrain flu builds using the same manually curated clade definition files. However, for surveillance purposes, it would be helpful to be able to identify and refer to new clades before they reach a high enough frequency to be manually annotated.

Description
Assign more granular clade ids to every clade in a tree that is distinguished by at least one amino acid mutation.

Possible solution
Consider using the find_clades.py script (or something like it) from the flu-forecasting project to automatically assign a unique id to each distinct amino acid haplotype. These haplotypes could span all of HA or could be restricted to HA1. Initially, these annotations could be added only to the nextflu private builds, to avoid confusion with the manually curated clades. Alternatively, we could include a description of how these clades are identified, so that the annotations would make more sense in the public builds.
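
To make the idea concrete, here is a minimal sketch of assigning one id per distinct amino acid haplotype. This is not the find_clades.py script and makes no claim about how it works; the function name `assign_haplotype_ids`, its parameters, the use of a truncated SHA-256 digest as the id, and the toy sequences are all assumptions for illustration only.

```python
import hashlib


def assign_haplotype_ids(sequences, start=0, end=None, id_length=7):
    """Map each sample name to a short id derived from its amino acid haplotype.

    `sequences` maps a sample (or tree node) name to its translated amino
    acid sequence. `start` and `end` are hypothetical 0-based slice bounds
    that restrict the haplotype to a region such as HA1.
    """
    ids = {}
    for name, sequence in sequences.items():
        haplotype = sequence[start:end]
        # Identical haplotypes hash to the same digest, so they share an id.
        digest = hashlib.sha256(haplotype.encode("utf-8")).hexdigest()
        ids[name] = digest[:id_length]
    return ids


# Toy amino acid sequences (made up, not real HA data).
sequences = {
    "sample_A": "MKTIIALSYI",
    "sample_B": "MKTIIALSYI",
    "sample_C": "MKTIIALSHI",
}

full_ids = assign_haplotype_ids(sequences)
# Restricting the region to the first 8 residues ignores the Y/H difference.
region_ids = assign_haplotype_ids(sequences, start=0, end=8)
```

With the full sequence, sample_A and sample_B share an id and sample_C gets a different one; restricted to the first 8 residues, all three share an id. A truncated digest can collide in principle, so a real implementation might prefer longer ids or a counter per newly seen haplotype.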