
Documentation on how DNA metabarcoding sequencing data was submitted to SRA


DNA metabarcoding submission to the Sequence Read Archive (SRA)

[to be updated] As is true for many other externally funded research projects, I had to submit the DNA metabarcoding data and make it publicly available and usable for future research. But I could not find a guide or even a blog post that documented the process. So I decided to document how I submitted DNA metabarcoding sequencing data to SRA using the submission portal.

First of all, as a biologist you do not necessarily come from a medical research background or share its terminology. In medical research, the word metabarcoding is not really used; the word metagenomics is used instead. Ecologists usually distinguish between metabarcoding (marker based) and metagenomics (shotgun sequencing). The first amplifies a small region of the DNA that is conserved enough to be used across a specific organism group, but variable enough to distinguish species. The latter (which I have no practical experience with) sequences fragments from all the DNA in a sample, which can then be assembled into genomes of one or more organisms.

So the take-home message is: if you want to submit metabarcoding data to SRA, remember that the term metagenome is (in some cases) used to cover metabarcoding as well.

1. Submit a BioProject

The first step is to create a BioProject.

Submitter

Enter submitter information, in this case info about me :bowtie:.

Project type

For this section, I found an existing project that sounded fairly similar and filled out the fields based on what they had done.

  • Project data type: Targeted loci environmental
  • Sample scope: Environment

Target

My attempt at the most descriptive title possible.

  • Environmental sample name: Arthropod CO1 DNA metabarcoding of flying insect bulk samples

General info

  • When should this submission be released to the public?: Release on specified date
  • Public description: The goal of this study was to characterize flying insect diversity using DNA metabarcoding (targeting the cytochrome c oxidase subunit 1 mitochondrial gene (COI)) of bulk insect samples collected with rooftop mounted car nets that sampled >250 five km routes in Denmark and Germany in June and July 2018 and 2019. DNA extracted from dried bulk insect samples was amplified with two COI primer pairs, fwhF2 + fwhR2n (Vamos, Elbrecht and Leese, 2017) and ZBJ-ArtF1c + ZBJ-ArtR2c (Zeale et al., 2011) and a 16S marker, Inse01 (Taberlet et al., 2018), for the purpose of comparing the efficacy of these primer sets used individually and in combination.
  • Relevance: Environmental

Before submitting, I added an external link to the project webpage, listed Aage V. Jensens Naturfond as the funding organisation, and added a link to a published paper that uses some of the data.

2. Submit BioSamples (batch sample submission)

BioSample type

I am not completely sure which BioSample package to choose. The BioProjects and associated BioSamples I found on SRA seem to use "Metagenome or environmental; version 1.0". Since I will submit to SRA, I will go with that package as well.

It seems that only a limited number of columns need to be filled in for most of the templates: sample_name, organism, collection_date, geo_loc_name, lat_lon. There is a good general guide and webinar from NCBI that explains the data types etc.
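To make the column list above concrete, here is a minimal Python sketch of how a tab-separated batch file with those five columns could be assembled. It is not the script used for this submission. The sample name, coordinates, region and the helper names (`format_lat_lon`, `build_biosample_tsv`) are hypothetical, and the organism value is simply the candidate discussed under "BioSample attributes". The sketch assumes the NCBI conventions of ISO dates for collection_date, "Country: region" for geo_loc_name, and decimal degrees with hemisphere letters for lat_lon; check the downloaded template for the exact requirements of your package.

```python
import csv
import datetime
import io

# The five columns named in the text above.
COLUMNS = ["sample_name", "organism", "collection_date", "geo_loc_name", "lat_lon"]


def format_lat_lon(lat, lon):
    """Turn signed decimal degrees into a string such as '55.6761 N 12.5683 E'."""
    ns = "N" if lat >= 0 else "S"
    ew = "E" if lon >= 0 else "W"
    return f"{abs(lat):.4f} {ns} {abs(lon):.4f} {ew}"


def build_biosample_tsv(records):
    """Return tab-separated text with one header line and one line per sample."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for rec in records:
        writer.writerow(
            {
                "sample_name": rec["sample_name"],
                "organism": rec["organism"],
                # datetime.date -> 'YYYY-MM-DD'
                "collection_date": rec["collection_date"].isoformat(),
                "geo_loc_name": f"{rec['country']}: {rec['region']}",
                "lat_lon": format_lat_lon(rec["lat"], rec["lon"]),
            }
        )
    return buffer.getvalue()


# Hypothetical example record; none of these values come from the real data set.
example_records = [
    {
        "sample_name": "route_001_2018",
        "organism": "Arthropoda environmental sample",
        "collection_date": datetime.date(2018, 6, 15),
        "country": "Denmark",
        "region": "Zealand",
        "lat": 55.6761,
        "lon": 12.5683,
    }
]

print(build_biosample_tsv(example_records))
```

The resulting text can be saved with a `.tsv` extension and compared against the template downloaded from the submission portal before uploading.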

The data compilation is documented in script XX.

BioSample attributes

Check: should organism be "Arthropoda environmental sample" (Taxonomy ID: 260574)? It seems other submissions with similar data have used that organism designation.

3. Submit metadata

References

  • Vamos, E., Elbrecht, V., & Leese, F. (2017). Short COI markers for freshwater macroinvertebrate metabarcoding. Metabarcoding and Metagenomics, 1, e14625. https://doi.org/10.3897/mbmg.1.14625
  • Zeale, M. R., Butlin, R. K., Barker, G. L., Lees, D. C., & Jones, G. (2011). Taxon‐specific PCR for DNA barcoding arthropod prey in bat faeces. Molecular Ecology Resources, 11(2), 236-244. https://doi.org/10.1111/j.1755-0998.2010.02920.x
  • Taberlet, P., Bonin, A., Zinger, L., & Coissac, E. (2018). Environmental DNA: For biodiversity research and monitoring. Oxford University Press.