This Nextflow workflow computes Sourmash sketches for k-mer-based comparisons of samples.
Note that it only uses the forward read of each sample to compute the sketch.
The default output directory is results. To change the output directory, specify --outdir FOLDERNAME.
nextflow run ctmrbio/sourmash_sketch --reads 'path/to/reads/*_{1,2}.fq.gz'
This will use whatever environment you currently have activated. To run with conda, add -profile conda to the command line.
nextflow run ctmrbio/sourmash_sketch --reads 'path/to/reads/*_{1,2}.fq.gz' -profile gandalf
Note that there is only a single - in -profile (this passes the argument to Nextflow instead of the workflow).
Three types of output are produced: a folder containing the individual sample sketches, a comparison matrix in binary NumPy format, and some plots.
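The comparison matrix can be inspected outside the workflow. The following is a minimal sketch, assuming the file is a square sample-by-sample similarity matrix readable by numpy.load. The function names load_comparison and most_similar_pair are hypothetical helpers written for this illustration, not part of the workflow, and the name of the matrix file inside the output directory is not stated here, so check your results folder for the actual path.

```python
import numpy as np


def load_comparison(matrix_path):
    """Load a comparison matrix stored in binary NumPy format.

    Assumption: the file holds a square, sample-by-sample matrix.
    """
    matrix = np.load(matrix_path)
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        raise ValueError(f"expected a square matrix, got shape {matrix.shape}")
    return matrix


def most_similar_pair(matrix):
    """Return (row, column, value) of the largest off-diagonal entry."""
    if matrix.shape[0] < 2:
        raise ValueError("need at least two samples to compare")
    # Mask the diagonal, where each sample is compared with itself.
    masked = matrix.astype(float).copy()
    np.fill_diagonal(masked, -np.inf)
    row, col = np.unravel_index(np.argmax(masked), masked.shape)
    return int(row), int(col), float(matrix[row, col])
```

To use it, call load_comparison with the path to the matrix file in your output directory and pass the result to most_similar_pair. The returned indices refer to rows and columns of the matrix. Mapping them back to sample names requires the sample order used when the matrix was built, which this sketch does not attempt to recover.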