ribodetector_cpu hangs with SLURM
gianfilippo opened this issue · 4 comments
I tried your package on an interactive SLURM session, and it worked.
I then tried to submit it as a job via SLURM and it hangs at
2023-03-09 16:13:36 : INFO Using high MCC model file: /home/conda_envs/ribodetector/lib/python3.9/site-packages/ribodetector/data/ribodetector_600k_variable_len70_101_epoch47.onnx on CPU
I already tried to reinstall and nothing changes.
The command I issued in both sessions is
ribodetector_cpu -t 8 -l 92 -i $FASTQ1.fq.gz $FASTQ1.fq.gz -e rrna -o $outFASTQ1.nonrrna.1.fq $outFASTQ2.nonrrna.2.fq
What can I do?
Could you post your SLURM script or the command used to submit the job? You need to set --cpus-per-task to the number of CPU cores you need and set --threads-per-core to 1.
I'm running into the same issue here. I submit it with sbatch, and it runs within a singularity container from here.
At the start there are two active processes on the node, and after 5 minutes there's nothing going on anymore.
This is my script:
#!/usr/bin/env bash
#SBATCH --time=1-00:00:00
#SBATCH --mem-per-cpu=4G
#SBATCH --cpus-per-task=12
#SBATCH --threads-per-core=1
cd /workdir
MEAN_READ_LENGTH=`zcat results/fastp/MP_35_R1_trimmed.fastq.gz | head -1000 | awk '{if(NR%4==2) {count++; bases += length} } END {print int(bases/count)}' || true`
echo "Estimated read length: $MEAN_READ_LENGTH"
singularity exec containers/ribodetector_0.2.7-cpu.sif \
ribodetector_cpu \
--threads "$SLURM_CPUS_PER_TASK" \
--input results/fastp/MP_35_R1_trimmed.fastq.gz results/fastp/MP_35_R2_trimmed.fastq.gz \
--output results/ribodetector/MP_35_R1.fastq.gz results/ribodetector/MP_35_R2.fastq.gz \
--rrna results/ribodetector/MP_35_R1_rrna.fastq.gz results/ribodetector/MP_35_R2_rrna.fastq.gz \
--ensure rrna
It works now. The issue was not setting --chunk_size, which led to memory issues.
"It works now. The issue was not setting --chunk_size, which led to memory issues."
RTFM.....
It is great that you figured out the solution. This will be beneficial to other users. Will incorporate this into the FAQ in README.
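For anyone landing here with the same hang, the fix described above can be sketched as a dry run that prints the full command with an explicit --chunk_size. This is only an illustration: the value 256 is an arbitrary placeholder and not a recommendation from this thread, the read length 92 is taken from the first post, the fallback of 12 threads mirrors the sbatch script, and the function name ribodetector_dry_run is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: print a ribodetector_cpu call that sets --chunk_size explicitly.

THREADS="${SLURM_CPUS_PER_TASK:-12}"  # falls back to 12 when not running under SLURM
READ_LENGTH=92                         # from the first post; use your own estimate
CHUNK_SIZE=256                         # placeholder value, tune it to the job's memory limit

# Hypothetical helper: echoes the command instead of running it.
ribodetector_dry_run() {
  echo ribodetector_cpu \
    --threads "$THREADS" \
    -l "$READ_LENGTH" \
    --chunk_size "$CHUNK_SIZE" \
    --input results/fastp/MP_35_R1_trimmed.fastq.gz results/fastp/MP_35_R2_trimmed.fastq.gz \
    --output results/ribodetector/MP_35_R1.fastq.gz results/ribodetector/MP_35_R2.fastq.gz \
    --ensure rrna
}

ribodetector_dry_run
```

Once the printed command looks right, dropping the echo inside the function (and prefixing singularity exec with the container path, if one is used) runs it for real. If the job still stalls, a smaller chunk size is the first thing to try.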