sbx_WGS is an extension for the sunbeam pipeline for de novo microbial genome assembly. This pipeline uses SPAdes for single genome assembly and CheckM and read map coverage of the assembled genome for quality assessment (sbx_SPARCQ.rules). In addition, this pipeline uses hmmer to identify SCCGs (sbx_SCCG.rules). FASTQ files for Mycoplasma genitalium are provided for testing.
For sunbeam v3 and up,
- sunbeam extend https://github.com/PennChopMicrobiomeProgram/sbx_WGS.git
For sunbeam v2 and earlier,
-
git clone https://github.com/PennChopMicrobiomeProgram/sbx_WGS.git extensions/sbx_WGS
-
Add config.yml to sunbeam_config.yml
cd extensions/sbx_WGS
cat config.yml >> /path/to/sunbeam_config.yml
-
Recommended for cluster: include the memory specifications in cluster.json, especially for checkm_tree rule
-
For running larger assemblies on a cluster, export the TMPDIR variable so anvio knows what directory to use to set up its databases (see run_example.sh)
threads (SPAdes, BWA, samtools): # of threads to use for running these programs
checkm_yml (optional): YAML file containing a sample:rank and sample:taxon dictionary for CheckM parameters (see example); rank is one of {life,domain,phylum,class,order,family,genus,species}; taxon is the taxonomic classification for the specified 'rank'
window_size (Read mapping): define the window size to do sliding window coverage
sampling (read mapping): define minimum length of the contig to slide over
metagenome: option for running this pipeline on metagenomic samples
The R markdown template to display the results from this pipeline and for analysis can be found in the Analysis_templates repository on Enterprise
- Use number of SCCGs to evaluate contamination and completeness in genomes
- Find method for taxonomic assignment for WGS
- Remove rules for read mapping; modify sbx_mapping_withFilter to take assembled genomes from this pipeline
This extension was adopted from pipelines written by Scott Daniel, Jung-Jin Lee, Ceylan Tanes, and Louis Taylor. Spades rules were adopted from sbx_spades (https://github.com/sunbeam-labs/sbx_spades). Read mapping rules were adopted from the sunbeam pipeline (https://github.com/sunbeam-labs/sunbeam/tree/stable/rules/mapping) and sbx_mapping_withFilter (https://github.com/ctanes/sbx_mapping_withFilter). Anvio rules were adopted from anvio_pan (https://github.com/junglee0713/anvio_pan)