pipelines and scripts for processing the 50 genomes data for sadacc
- separate data processing from data summarising/reporting in workflows
- referenceFasta should have a more informative name, and its files should be accessed by named keys rather than numeric indices
- change the save/output steps so that they use publishDir + patterns
- all processes should use templates, i.e. follow the rule: 1 process = 1 template
- make all design decisions consistent across the repo
- simple descriptions of the workflow for humans to follow, including dataflow diagrams
- CHPC has severe limitations on the number of jobs that can be running simultaneously for a given user. Therefore each script needs to be adjusted to run in 'local' mode on a compute node
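
A minimal sketch of how several of the items above could look together in Nextflow (DSL2). All names here (`params.reference`, `ALIGN_READS`, `align_reads.sh`, the paths and the CPU count) are hypothetical placeholders, not the names used in this repo:

```groovy
// main.nf (sketch; every name and path below is a placeholder)

// Named keys instead of positional indices for the reference files
params.reference = [
    fasta: 'ref/genome.fa',
    fai  : 'ref/genome.fa.fai'
]
params.outdir = 'results'

process ALIGN_READS {
    // Outputs matching the pattern are copied to the results directory
    publishDir "${params.outdir}/bam", mode: 'copy', pattern: '*.bam'

    input:
    tuple val(sample_id), path(reads)
    path ref_fasta

    output:
    path '*.bam'

    script:
    // 1 process = 1 template; Nextflow looks for it in templates/ next to the script
    template 'align_reads.sh'
}
```

```groovy
// nextflow.config (sketch)

// Run every task on the node the job was submitted to,
// rather than submitting one scheduler job per task
process.executor = 'local'

// Assumed core count of one compute node; adjust to the real allocation
executor.cpus = 24
```

With a config like this, the whole pipeline would be submitted as a single scheduler job, which keeps the number of simultaneously running jobs per user at one.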