nextstrain/seasonal-flu

Integrating FASTA Sequence into Influenza A/H3N2 Evolution Analysis and Visualizing in Nextclade

sekhwal opened this issue · 6 comments

I would like to add my own FASTA sequence to the evolutionary analysis of Influenza A/H3N2 and visualize the result. I ran the analysis in Nextclade, but I could not find a way to show year information on the x-axis of the phylogenetic tree. Any suggestions would be appreciated.

Hi @sekhwal,

As @rneher stated in the discussion forum, Nextclade does not support time-scaled trees, so you will have to run a full Nextstrain phylogenetic workflow to create a time-scaled tree.

We are currently lacking documentation on how to run the seasonal flu workflow with custom sequences. The easiest way to get started for now will be to follow the Quickstart with GISAID data.

However, I could not find the "EpiFlu" link in the top navigation bar at GISAID (https://gisaid.org/). I am not sure whether I have to register at GISAID to get the EpiFlu link.

You will need to register at GISAID in order to access and download data from them.

Also, please let me know how to get "profiles/gisaid/builds.yaml". It would be great if you could provide a template for preparing "builds.yaml".

You can start with the existing profiles/gisaid/builds.yaml file in this repo.

I have some more follow-up questions.

  1. Downloading the sequences from GISAID takes a very long time, and it only allows 20,000 sequences to be downloaded. Also, when running nextstrain build with the following command, where do I provide the downloaded FASTA files as input?

nextstrain build . --configfile profiles/gisaid/builds.yaml --use-conda --conda-frontend mamba

  2. In addition, should I download the "seasonal-flu" GitHub repo?

  3. In builds.yaml, do I need to change anything in the following part? Where should I provide the metadata file?

reference: "config/h3n2/{segment}/reference.fasta"
annotation: "config/h3n2/{segment}/genemap.gff"
tree_exclude_sites: "config/h3n2/{segment}/exclude-sites.txt"
clades: "config/h3n2/ha/clades.tsv"
subclades: "config/h3n2/ha/subclades.tsv"
auspice_config: "config/h3n2/auspice_config.json"

Also, when running nextstrain build with the following command, where do I provide the downloaded FASTA files as input?

Following the Quickstart with GISAID data, please move your downloaded files to data/h3n2/metadata.xls and data/h3n2/raw_sequences_ha.fasta.

In addition, should I download the "seasonal-flu" GitHub repo?

Yes, you will need to download the seasonal flu repo to run the workflow.
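
Putting the two answers above together, the setup might look roughly like the sketch below. The destination paths (data/h3n2/metadata.xls and data/h3n2/raw_sequences_ha.fasta) and the build command come from this thread. The helper name `stage_gisaid_files` and the download file names in the commented example are hypothetical, so adjust them to whatever GISAID actually gave you.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the seasonal-flu repo): copy GISAID
# downloads to the paths named in the reply above.
# Usage: stage_gisaid_files METADATA_XLS SEQUENCES_FASTA [REPO_DIR]
stage_gisaid_files() {
    local metadata_src="$1"
    local sequences_src="$2"
    local repo_dir="${3:-.}"          # defaults to the current directory
    local dest="$repo_dir/data/h3n2"

    mkdir -p "$dest"
    cp "$metadata_src" "$dest/metadata.xls"
    cp "$sequences_src" "$dest/raw_sequences_ha.fasta"
}

# Typical sequence of steps. Left commented out because cloning needs
# network access and the download file names are assumptions:
#
#   git clone https://github.com/nextstrain/seasonal-flu.git
#   cd seasonal-flu
#   stage_gisaid_files ~/Downloads/my_metadata.xls ~/Downloads/my_sequences.fasta .
#   nextstrain build . --configfile profiles/gisaid/builds.yaml --use-conda --conda-frontend mamba
```

The helper copies rather than moves the files so that the original downloads stay untouched if a later step needs to be repeated.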

In builds.yaml, do I need to change anything in the following part?

Try using the default values first to produce the build. Then if you would like to make adjustments, you can edit the parameters in the builds.yaml file.

Closing since the conversation has continued in #149.