fulcrumgenomics/dagr

Dagr should support task args with spaces and special characters

Closed · 3 comments

Running a pipeline with arguments that contain spaces or shell metacharacters does not go well:

java -Xmx1g -jar target/scala-2.11/dagr-0.1.0-SNAPSHOT.jar \
  --config=./resources/application.conf \
  DnaResequencingFromFastqPipeline \
  --fastq1 r1.fq.gz \
  --fastq2 r2.fq.gz \
  --ref /Work/refseq/hs38DH/hs38DH.fa \
  -s 'My Stupid Sample' \
  -l 'Foo*Bar' \
  -p Nope\! \
  --tmp /tmp \
  -o ./breakage

It generates a script that contains the following, with the SM, PU, and LB values (and the Foo*Bar in the metrics path) written out unquoted:

run () {
  java -Dsamjdk.buffer_size=131072 -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Xmx4096m -jar /Work/dgx/packages/picard/picard.jar FastqToSam VALIDATION_STRINGENCY=SILENT CREATE_INDEX=true F1=/Work/dgx/data/test/r1.fq.gz F2=/Work/dgx/data/test/r2.fq.gz O=/dev/stdout SM=My Stupid Sample PL=ILLUMINA PU=Nope! LB=Foo*Bar STRIP_UNPAIRED_MATE_NUMBER=true SO=queryname \
    |  java -Dsamjdk.buffer_size=131072 -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Xmx1024m -jar /Work/dgx/packages/picard/picard.jar FifoBuffer VALIDATION_STRINGENCY=SILENT CREATE_INDEX=true BUFFER_SIZE=536870912 \
    |  java -Dsamjdk.buffer_size=131072 -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Xmx1024m -jar /Work/dgx/packages/picard/picard.jar MarkIlluminaAdapters VALIDATION_STRINGENCY=SILENT CREATE_INDEX=true I=/dev/stdin O=/tmp/unmapped.4895479567643351291.bam M=/Work/dgx/data/test/breakage/Foo*Bar.adapter_metrics.txt
}
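
To illustrate why the unquoted values are a problem, here is a minimal sketch in plain sh. `count_args` is a hypothetical stand-in for a tool such as Picard; it only reports how many arguments the shell handed to it. The file names created in the temporary directory are also made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical stand-in for a real tool: prints the number of arguments received.
count_args() { printf '%s\n' "$#"; }

# Unquoted, as in the generated script: the shell splits the value on spaces.
count_args SM=My Stupid Sample      # prints 3
# Quoted: the value arrives as a single argument.
count_args 'SM=My Stupid Sample'    # prints 1

# An unquoted * is a glob. If matching files happen to exist in the working
# directory, the single argument is replaced by the matching file names.
dir=$(mktemp -d)
touch "$dir/LB=FooXBar" "$dir/LB=FooYBar"
( cd "$dir" && count_args LB=Foo*Bar )     # prints 2
( cd "$dir" && count_args 'LB=Foo*Bar' )   # prints 1
rm -rf "$dir"
```

The glob case is the nastier of the two, since whether it breaks depends on what files are present when the script runs.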

To fix this we'll need to do two things:

  1. Change the code that builds up the script to either quote and escape every single argument (ugly), or detect which arguments need quoting/escaping and apply it just to those (more work, but a lot less ugly).
  2. Move the code that gloms together the args for all the tasks in a PipeChain out of PipeChain and into code that builds the scripts for execution (so as not to erroneously escape out the pipe characters).

This is a nice (though, I think, incomplete, since it misses backticks) reference. I think the fix amounts to detecting any of these characters in a string and, if any are present:

  1. Escape each single quote in the string (in a POSIX shell this has to be '\'', since a bare \' is not an escape inside single quotes)
  2. Wrap the string in single quotes
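
The approach described above can be sketched in plain sh. This is not the code that went into dagr; `quote_arg` and `quote_cmd` are hypothetical helper names, and the set of characters treated as safe is an assumption chosen to be conservative (anything outside it gets quoted, which also covers backticks). Because a backslash is literal inside single quotes in a POSIX shell, the sketch rewrites each embedded `'` as `'\''` (close the quote, emit an escaped quote, reopen).

```shell
#!/bin/sh
# Hypothetical helper: quote one argument only if it needs it.
# Assumed safe set: letters, digits, and _ . / = : , + @ % -
# Caveat: trailing newlines in an argument are dropped by $(...).
quote_arg() {
  case $1 in
    '' | *[!A-Za-z0-9_./=:,+@%-]*)
      # Replace each ' with '\'' and wrap the result in single quotes.
      printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
      ;;
    *)
      printf '%s' "$1"
      ;;
  esac
}

# Hypothetical helper: quote every argument of one task and join with spaces.
quote_cmd() {
  out=''
  for arg in "$@"; do
    out="${out:+$out }$(quote_arg "$arg")"
  done
  printf '%s' "$out"
}

# Each task is quoted on its own; the pipe is added afterwards by whatever
# builds the script, so it is never escaped itself.
first=$(quote_cmd picard FastqToSam 'SM=My Stupid Sample' 'PU=Nope!' 'LB=Foo*Bar')
second=$(quote_cmd picard FifoBuffer BUFFER_SIZE=536870912)
printf '%s | %s\n' "$first" "$second"
```

Quoting per task and joining with `|` only at the end mirrors the second item in the plan: the pipe characters are structure, not arguments, so they should never pass through the escaping step.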

nh13 commented

@tfenne is this done?

Yup.