This workflow was developed by Brian Foster at JGI and originates from his repo. It takes paired-end reads and runs error correction with bbcms (BBTools). The cleaned reads are then assembled by metaSPAdes. After assembly, the reads are mapped back to the contigs with bbmap (BBTools) to obtain coverage information.
Description of the files:
- .wdl file: the WDL file for workflow definition
- .json file: the example input for the workflow
- .conf file: the conf file for running Cromwell
- .sh file: the shell script for running the example workflow
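The shell script itself is not reproduced here. As a rough sketch of what such a launcher typically does, the Python snippet below assembles a Cromwell `run` command from the other three files. The file names (`cromwell.jar`, `jgi_assembly.wdl`, `input.json`, `cromwell.conf`) are placeholders chosen for illustration, not the names used in this repo.

```python
import shlex


def cromwell_command(wdl, inputs_json, conf, cromwell_jar="cromwell.jar"):
    """Return the argument list for a local `cromwell run` invocation.

    All paths are hypothetical; substitute the .wdl, .json and .conf
    files that ship with the workflow.
    """
    return [
        "java",
        f"-Dconfig.file={conf}",  # Cromwell reads its configuration from this property
        "-jar", cromwell_jar,
        "run", wdl,
        "-i", inputs_json,        # -i / --inputs: the workflow inputs JSON
    ]


cmd = cromwell_command("jgi_assembly.wdl", "input.json", "cromwell.conf")
print(shlex.join(cmd))
```

The command is built as a list so it can be passed straight to `subprocess.run(cmd, check=True)` without shell quoting concerns.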
Inputs:
- fastq (Illumina paired-end interleaved fastq)
- contig prefix for fasta header
- project name
- resource where the workflow is run
- informed_by
- memory (optional), e.g. "jgi_metaASM.memory": "105G"
- threads (optional), e.g. "jgi_metaASM.threads": "16"
{
"jgi_metaASM.input_file":"/global/cfs/projectdirs/m3408/ficus/11809.7.220839.TCCTGAG-ACTGCAT.fastq.gz",
"jgi_metaASM.rename_contig_prefix":"503125_160870",
"jgi_metaASM.proj":"nmdc:503125_160870",
"jgi_metaASM.resource": "NERSC -- perlmutter",
"jgi_metaASM.informed_by": "nmdc:xxxxxx",
"jgi_metaASM.memory": "105G",
"jgi_metaASM.threads": "16"
}
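When many samples are processed, the input JSON is usually generated rather than edited by hand. The following is a minimal sketch of one way to do that in Python. The key names are taken from the example above; the split into required and optional keys follows the input list, and the `build_inputs` helper itself is hypothetical and not part of the workflow.

```python
import json

# Required and optional inputs, as named in the example JSON above.
REQUIRED = ("input_file", "rename_contig_prefix", "proj", "resource", "informed_by")
OPTIONAL = ("memory", "threads")


def build_inputs(workflow="jgi_metaASM", **params):
    """Return a Cromwell inputs dict with keys of the form '<workflow>.<name>'."""
    missing = [k for k in REQUIRED if k not in params]
    if missing:
        raise ValueError(f"missing required inputs: {', '.join(missing)}")
    unknown = [k for k in params if k not in REQUIRED + OPTIONAL]
    if unknown:
        raise ValueError(f"unknown inputs: {', '.join(unknown)}")
    # The example passes every value as a string, including threads.
    return {f"{workflow}.{k}": str(v) for k, v in params.items()}


inputs = build_inputs(
    input_file="/global/cfs/projectdirs/m3408/ficus/11809.7.220839.TCCTGAG-ACTGCAT.fastq.gz",
    rename_contig_prefix="503125_160870",
    proj="nmdc:503125_160870",
    resource="NERSC -- perlmutter",
    informed_by="nmdc:xxxxxx",
    memory="105G",
    threads=16,
)
print(json.dumps(inputs, indent=2))
```

Leaving out `memory` and `threads` simply omits those keys from the resulting JSON.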
Below is a partial list of the output files. The main assembly contigs output is in final_assembly/assembly_contigs.fna.
├── bbcms
│ ├── berkeleylab-jgi-meta-60ade422cd4e
│ ├── counts.metadata.json
│ ├── input.corr.fastq.gz
│ ├── input.corr.left.fastq.gz
│ ├── input.corr.right.fastq.gz
│ ├── readlen.txt
│ └── unique31mer.txt
├── final_assembly
│ ├── assembly.agp
│ ├── assembly_contigs.fna
│ ├── assembly_scaffolds.fna
│ └── assembly_scaffolds.legend
├── mapping
│ ├── covstats.txt (mapping_stats.txt)
│ ├── pairedMapped.bam
│ ├── pairedMapped.sam.gz
│ ├── pairedMapped_sorted.bam
│ └── pairedMapped_sorted.bam.bai
└── spades3
  ├── assembly_graph.fastg
  ├── assembly_graph_with_scaffolds.gfa
  ├── contigs.fasta
  ├── contigs.paths
  ├── scaffolds.fasta
  └── scaffolds.paths
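After a run finishes, it can be useful to confirm that the key outputs from the listing above are present. The sketch below checks a hand-picked subset of the listed files (the final assembly and the sorted BAM with its index). The choice of files and the `missing_outputs` helper are illustrative assumptions; the workflow does not provide such a check.

```python
from pathlib import Path

# A subset of the files from the listing above, keyed by subdirectory.
EXPECTED = {
    "final_assembly": [
        "assembly.agp",
        "assembly_contigs.fna",
        "assembly_scaffolds.fna",
        "assembly_scaffolds.legend",
    ],
    "mapping": [
        "pairedMapped_sorted.bam",
        "pairedMapped_sorted.bam.bai",
    ],
}


def missing_outputs(outdir, expected=EXPECTED):
    """Return the relative paths of expected files that are absent under outdir."""
    root = Path(outdir)
    return [
        f"{sub}/{name}"
        for sub, names in expected.items()
        for name in names
        if not (root / sub / name).is_file()
    ]
```

Calling `missing_outputs("/path/to/output")` returns an empty list when every file in `EXPECTED` exists, and otherwise the relative paths that are missing.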