Oshlack/JAFFA

How to pass the qin=33 flag to the bpipe command (JAFFAL)?


Some of my ONT data seem to have an issue with Phred base-quality encoding, and the pipeline fails with this error:

Warning! Changed from ASCII-33 to ASCII-64 on input Z: 90 -> 59
Up to 2 prior reads may have been generated with incorrect qualities.
If this is a problem you may wish to re-run with the flag 'qin=33' or 'qin=64'.

/...  more text/

Exception in thread "main" java.lang.RuntimeException: ReformatReads terminated in an error state; the output may be corrupt.
	at jgi.ReformatReads.process(ReformatReads.java:1103)
	at jgi.ReformatReads.main(ReformatReads.java:43)

When I run the process itself with qin=33, it completes successfully:

java -ea -Xmx200m -cp /home/epi2melabs/JAFFA/tools/bbmap/current/ jgi.ReformatReads ignorebadquality=t in=in.fastq out=tst3.all.fasta threads=4 qin=33
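
To double-check which encoding a file really uses, the ASCII range of its quality lines can be inspected. The following is only a minimal sketch and is not part of JAFFA or BBTools: `fastq_qual_range` is a hypothetical helper name, and it assumes plain 4-line FASTQ records. Phred+33 qualities start at ASCII 33 (`!`), while the older +64 encodings do not go below ASCII 59, so a minimum under 59 points to `qin=33`.

```shell
# Sketch: print the lowest and highest ASCII codes found on the quality lines
# of a FASTQ file. Assumes plain 4-line records (no wrapped sequence lines).
# fastq_qual_range is a hypothetical helper, not a JAFFA or BBTools command.
fastq_qual_range() {
  awk '
    BEGIN { for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i }
    NR % 4 == 0 {
      n = length($0)
      for (j = 1; j <= n; j++) {
        ch = substr($0, j, 1)
        if (!(ch in ord)) continue          # skip anything outside printable ASCII
        c = ord[ch]
        if (min == "" || c < min) min = c
        if (max == "" || c > max) max = c
      }
    }
    END { print min, max }
  ' "$1"
}

# Usage (prints "<min> <max>"):
#   fastq_qual_range in.fastq
```

A result such as a minimum of 33 together with a high maximum would be consistent with Phred+33 reads that contain some unusually high quality values.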

However, I'd like to pass qin=33 to the pipeline call. I tried:

/home/epi2melabs/JAFFA/tools/bin/bpipe run -n 4  -p jaffa_output="tst_dir_out" -p refBase=JAFFA_gencode_44  -p genome=hg38  -p annotation=genCode44 -p fastqInputFormat="*.fastq" -p qin=33  /home/epi2melabs/JAFFA/JAFFAL.groovy  in.fastq || :

but this does not work: the pipeline fails with the same error as before. Am I doing something wrong (I am not really familiar with bpipe), and how can I fix it?

Running JAFFA version 2.3_dev

Apart from editing JAFFA_stages.groovy directly, the easiest approach is to set the reformat variable to "reformat qin=33". JAFFA takes the program name as a variable, so you can append additional parameters to it, e.g.:
/home/epi2melabs/JAFFA/tools/bin/bpipe run -n 4 -p jaffa_output="tst_dir_out" -p refBase=JAFFA_gencode_44 -p genome=hg38 -p annotation=genCode44 -p fastqInputFormat="*.fastq" -p reformat="reformat qin=33" /home/epi2melabs/JAFFA/JAFFAL.groovy in.fastq

Alternatively, if you think you will always want this, you can edit tools.groovy:
reformat="/path_to_JAFFA/JAFFA/tools/bin/reformat qin=33"

As an aside, the same trick can also be useful for things like:
R="module load R; R"
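
The reason this works can be sketched in plain shell. This is an illustration only, assuming the stage substitutes the tool variable as text at the start of its command line; `stage_command` is a hypothetical stand-in, not JAFFA's actual stage code, and the remaining arguments are borrowed from the direct call earlier in this thread.

```shell
# Hypothetical stand-in for a pipeline stage: the tool variable is pasted in
# as text, so anything appended to it becomes part of the final command line.
stage_command() {
  tool="$1"   # e.g. the value given with -p reformat="..."
  printf '%s ignorebadquality=t in=%s out=%s\n' "$tool" "$2" "$3"
}

# Default value versus the overridden value:
stage_command "reformat" in.fastq out.fasta
stage_command "reformat qin=33" in.fastq out.fasta
```

The second call prints a command line that begins with `reformat qin=33`, which is how the extra flag would reach the tool without touching the stage definition.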

@nadiadavidson Thank you! This is a very useful tip! Let's hope new ONT flow cells & chemistry will fix these base calling issues.