COMBINE-lab/salmon

Salmon quant with --recoverOrphans fails without warning


Is the bug primarily related to salmon (bulk mode) or alevin (single-cell mode)?
Salmon (bulk mode)
Describe the bug
Salmon fails without any warning when --recoverOrphans is used as part of quasi-mapping. With the flag, Salmon exits with nonzero exit code 9 (shown as 9:0 by squeue); dropping --recoverOrphans allows the job to complete. This may also be related to #929.
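
To make the 9:0 value easier to read, here is a minimal sketch that splits such a pair into its two fields. It assumes the value follows Slurm's exit_code:signal convention (as in the ExitCode field reported by sacct); under that assumption, 9:0 would mean exit status 9 with no terminating signal, not signal 9. The function name describe_exit_pair is hypothetical.

```shell
# Hypothetical helper: split a Slurm-style "exit_code:signal" pair into its parts.
# Assumption: the pair uses the exit_code:signal layout.
describe_exit_pair() {
  local pair="$1"
  local code="${pair%%:*}"    # text before the colon
  local signal="${pair##*:}"  # text after the colon
  echo "exit_code=${code} signal=${signal}"
}

describe_exit_pair "9:0"   # prints: exit_code=9 signal=0
```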

To Reproduce
Steps and data to reproduce the behavior:
We followed https://combine-lab.github.io/alevin-tutorial/2019/selective-alignment/ to generate our index.

SLURM script

#!/bin/bash
#SBATCH --chdir=/path/to/working/dir/
#SBATCH --partition=short
#SBATCH --job-name=Salmon
#SBATCH --error=/path/to/logs/%x_%j.err
#SBATCH --output=/path/to/logs/%x_%j.out
#SBATCH --ntasks=6
#SBATCH --time=02:00:00
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=30G
module load parallel # parallel/20150822-GCC-4.9.2
module load Anaconda3/2022.05
conda activate Salmon

parallel --jobs 6 --header : --colsep ',' \
   'salmon quant -I /path/to/index/folder/ \
   -l A \
   -1 /path/to/"{fastq_1}" \
   -2 /path/to/"{fastq_2}" \
   --writeUnmappedNames \
   --validateMappings \
   --recoverOrphans \
   --gcBias \
   --seqBias \
   --recoverOrphans \
   -o /path/to/output/{Samples} \
   --threads 2' :::: /path/to/sheet_with_sample_and_fastq_names.csv
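
Because the failure is silent, it may help to record the exit status of each job explicitly. The following is a minimal sketch, not part of the original script: run_and_report is a hypothetical wrapper that runs any command and prints a message to stderr when it exits nonzero. The demo uses a stand-in command (sh -c 'exit 9') rather than salmon.

```shell
# Hypothetical helper: run a command for a named sample and
# report a nonzero exit status on stderr.
run_and_report() {
  local sample="$1" rc
  shift
  "$@"          # run the remaining arguments as the command
  rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "sample ${sample}: command exited with status ${rc}" >&2
  fi
  return "$rc"
}

# Demo with a stand-in command that exits with status 9.
run_and_report demo_sample sh -c 'exit 9' || echo "wrapper returned $?"
```

GNU parallel also has a --joblog option that writes one line per job, including an Exitval column, which might be a simpler way to see which samples failed.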

Specifically, please provide at least the following information:

  • Which version of salmon was used?
    Both 1.10.2 and 1.10.3 were tested.

  • How was salmon installed (compiled, downloaded executable, through bioconda)?
    Used bioconda

  • Which reference (e.g. transcriptome) was used?
    GRCh38

  • Which read files were used?
    Illumina NovaSeq reads. The fastq files were split across lanes and needed top-off data added, so they were merged by read direction with zcat. Adapters were trimmed with cutadapt.

  • Which program options were used?
    Ribodetector was used to remove rRNA contamination. The non-rRNA output files were then used as input to salmon quant.

Expected behavior
salmon quant should run to completion and output .sf files.
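
A quick way to see which samples did not produce output is to scan the output root for sample folders without a quantification file. This is a minimal sketch under two assumptions: each sample has its own folder under one output root (as with -o /path/to/output/{Samples}), and the file to look for is salmon's quant.sf. The function name list_missing_quant and the demo folders are hypothetical.

```shell
# Hypothetical helper: print the names of sample folders under an output
# root that have no non-empty quant.sf file.
list_missing_quant() {
  local root="$1" dir
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue          # skip if the glob matched nothing
    if [ ! -s "${dir}quant.sf" ]; then
      basename "${dir%/}"
    fi
  done
}

# Demo on a throwaway directory: sampleA has a quant.sf, sampleB does not.
demo_root=$(mktemp -d)
mkdir "$demo_root/sampleA" "$demo_root/sampleB"
echo "placeholder" > "$demo_root/sampleA/quant.sf"
list_missing_quant "$demo_root"   # prints: sampleB
rm -rf "$demo_root"
```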

Screenshots
If applicable, add screenshots or terminal output to help explain your problem.
From SLURM generated error file

Desktop (please complete the following information):

  • OS: [e.g. Ubuntu Linux, OSX]
  • Version [ If you are on OSX, the output of sw_vers. If you are on linux the output of uname -a and lsb_release -a]
    HPCS: Red Hat Server 7.9

Additional context
Removing --recoverOrphans lets the jobs run to completion.

Oddly enough, with --recoverOrphans dropped, I start seeing output in the .err files I set in SLURM rather than in the .log file in each output folder. The .err files typically end right after reporting that the hits for the fragments are finished, unlike the salmon_output.log files produced without --recoverOrphans.

As an aside, when googling "recover orphans salmon crash" this was the top result: https://ksltv.com/635908/tens-of-thousands-of-live-salmon-fell-off-a-truck-in-oregon-and-into-a-creek/