happykhan/ronaQC

Datasets as a worked example


Create some docs that use these datasets as a worked example:

  • Failed QC
  • VOI/VOC lineages

Both datasets come from https://github.com/CDCgov/datasets-sars-cov-2

Fetching datasets

conda create -n datasets-sars-cov-2 -c conda-forge -c bioconda uscdc-datasets-sars-cov-2

conda activate datasets-sars-cov-2

export NCBI_API_KEY="<your-NCBI-API-key-here>"

GenFSGopher.pl --numcpus 8 --compressed --outdir vocvoi-dataset ~/miniconda3/envs/datasets-sars-cov-2/share/uscdc-datasets-sars-cov-2/sars-cov-2-voivoc.tsv

GenFSGopher.pl --numcpus 8 --compressed --outdir failedQC-dataset ~/miniconda3/envs/datasets-sars-cov-2/share/uscdc-datasets-sars-cov-2/sars-cov-2-failedQC.tsv
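The two commands above hard-code a `~/miniconda3` install location. A minimal sketch of a more portable variant follows, assuming the environment is active so that `CONDA_PREFIX` points at its root and the TSV files sit under `share/uscdc-datasets-sars-cov-2/` as in the paths above. `dataset_tsv` and `fetch_dataset` are hypothetical helper names introduced here for illustration; they are not part of the CDC package.

```shell
# Hypothetical helper: print the path to a dataset TSV inside the active
# conda environment. Assumes CONDA_PREFIX is set by `conda activate`.
dataset_tsv() {
  if [ -z "${CONDA_PREFIX:-}" ]; then
    echo "CONDA_PREFIX is not set; activate the environment first" >&2
    return 1
  fi
  printf '%s\n' "$CONDA_PREFIX/share/uscdc-datasets-sars-cov-2/$1.tsv"
}

# Hypothetical helper: download one dataset with the same GenFSGopher.pl
# options used above. $1 = dataset name (without .tsv), $2 = output directory.
fetch_dataset() {
  fd_tsv="$(dataset_tsv "$1")" || return 1
  if [ ! -f "$fd_tsv" ]; then
    echo "dataset file not found: $fd_tsv" >&2
    return 1
  fi
  GenFSGopher.pl --numcpus 8 --compressed --outdir "$2" "$fd_tsv"
}

# Usage (not run here):
#   fetch_dataset sars-cov-2-voivoc   vocvoi-dataset
#   fetch_dataset sars-cov-2-failedQC failedQC-dataset
```

Checking that the TSV exists before calling `GenFSGopher.pl` gives a clearer error when the environment is installed somewhere other than `~/miniconda3`.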

The datasets are available on Zenodo.

Test data is available here: https://zenodo.org/record/7018405