SciLifeLab/NGI-NeutronStar

Remi's master to-do list

remiolsen opened this issue · 2 comments

For transparency, here's my "design document"


Main requirements

  • Nextflow? yes
  • Supernova running in $SNIC_TMP (Irma compatible?)
    • 1.20 compatible — multiple input parameter assemblies
    • [ ] Use Nextflow publishDir instead of rsync. Couldn't make it work, so use rsync!
    • Make this optional
  • Rsync supernova assembly back to workdir
  • supernova mkoutput - pseudohap, megabubbles
    • gunzip
    • parameter of additional outputs — always output .phased.fasta
    • parameter for minimum length
  • QUAST
    • make it run on Irma
  • BUSCO
    • UPPMAX — beforeScript
  • MultiQC
    • Needs testing
  • support for --no-preflight flag
  • Documentation
    • Readme.md
  • dump software versions & commands that were run
  • Send mail when the pipeline is finished
  • Clean up and generalize the configs
    • Common HPC config
    • Common UPPMAX config
    • Make a general local run config
  • Release tags

Docker / Singularity

  • Supernova (copyright issues?)
  • QUAST
  • BUSCO
  • Script for automatic singularity/docker download / installation

NX script

  • input configuration:
    • id
      • fastqs
      • sample
      • maxreads
      • bcfrac
    • genomesize
  • memory parameter
  • cpu parameter
  • make Longranger / fastqc optional

Input_validation

  • id — only numbers, letters, dash, and underscore allowed
  • bcfrac (0,1)
  • maxreads - num
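
A rough sketch of what these three checks could look like, written in Python for illustration only (the pipeline itself is Nextflow, so the real checks would live there). The function name `validate_sample` and the dict layout are hypothetical. It assumes "(0,1)" means the open interval and that "num" means a positive integer; both may need loosening, e.g. if 1 or a non-numeric keyword should be accepted.

```python
import re
from typing import List

# Assumption: ids may contain only ASCII letters, digits, dash and underscore.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def validate_sample(sample: dict) -> List[str]:
    """Return a list of problems found; an empty list means the sample is valid."""
    errors = []

    # id is required and must match the allowed character set in full.
    sample_id = str(sample.get("id", ""))
    if not ID_PATTERN.fullmatch(sample_id):
        errors.append("id %r: only letters, numbers, dash and underscore allowed" % sample_id)

    # bcfrac is optional; assumed here to lie strictly between 0 and 1.
    bcfrac = sample.get("bcfrac")
    if bcfrac is not None:
        is_number = isinstance(bcfrac, (int, float)) and not isinstance(bcfrac, bool)
        if not is_number or not 0 < bcfrac < 1:
            errors.append("bcfrac %r: expected a number between 0 and 1" % (bcfrac,))

    # maxreads is optional; assumed here to be a positive integer.
    maxreads = sample.get("maxreads")
    if maxreads is not None:
        is_int = isinstance(maxreads, int) and not isinstance(maxreads, bool)
        if not is_int or maxreads <= 0:
            errors.append("maxreads %r: expected a positive integer" % (maxreads,))

    return errors
```

Returning a list of messages rather than raising on the first failure means every problem in a sample sheet can be reported in one go.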

MultiQC

  • Fix when having empty molecule.yaml files
  • Does having “ASSEMBLER_CS” folders break multiqc?
  • Fix QUAST module. It breaks when running with -s option

Testing

  • Test data from NA12878 run.
  • Travis-CI integration

Could haves

  • Tigmint evaluation
  • Delivery template mail / output folder structure
  • BWA align
    • picard-tools
    • remove dups
    • collectinsertsize
  • qaTools-singularity
  • FRC-singularity
  • BUSCOv2 datasets in config
    • auto-script to download datasets

ewels commented

Re: software version numbers - just a note not to copy the RNA pipeline for this, but instead I'd recommend having a dedicated process as in the ChIPseq pipeline (and others). The RNAseq approach has been more of a pain.
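
To illustrate the idea of collecting versions in one dedicated step, here is a minimal Python sketch of a helper such a process might call. It is not taken from the ChIPseq pipeline; the function name `collect_versions` and the command mapping are hypothetical, and the actual version flags for each tool would have to be filled in by whoever wires it up.

```python
import subprocess
import sys


def collect_versions(commands):
    """Run each command and keep the first line of its output as the version.

    `commands` maps a tool name to the argv list that prints its version.
    Tools that cannot be started are recorded as "not found".
    """
    versions = {}
    for name, argv in commands.items():
        try:
            result = subprocess.run(
                argv,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,  # some tools print versions to stderr
                universal_newlines=True,
                check=False,
            )
        except OSError:
            versions[name] = "not found"
            continue
        output = result.stdout.strip()
        versions[name] = output.splitlines()[0] if output else "unknown"
    return versions


if __name__ == "__main__":
    # Only the Python interpreter is queried here; real tool commands are left out on purpose.
    for tool, version in collect_versions({"python": [sys.executable, "--version"]}).items():
        print("%s\t%s" % (tool, version))
```

Keeping this in a single place means the version table is produced once per run instead of being scraped from each tool's process separately.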

ewels commented

This issue was moved to nf-core#1