metagenome-atlas/atlas

sbatch submission error with slurm

farhadm1990 opened this issue · 4 comments

Hello everyone,

I am new to shotgun metagenomics and am basically following the docs on the atlas website to run through my reads. I am using a cluster with three main queues. When I submit a job via sbatch, e.g. 'atlas run qc --profile cluster --jobs 99', it fails with a "requested annotation" error. However, when I run the same command on a single machine it works perfectly; the only problem is that my terminal is then blocked.
Here you can see my cluster_config.yaml setup; my queue name is ghpc_v3 and I am using slurm.

__default__:
  queue: "ghpc_v3"
  #account: ""
  nodes: 1


#You can overwrite values for specific rules
#rulename:
#  queue: long
#  account: ""
#  time_min:  # min
#  threads:

and here is my bash script for submitting via the sbatch command:

#!/bin/bash
#SBATCH -p ghpc_v3
#SBATCH -n 1
#SBATCH --mem=300G 
#SBATCH -t 48:00:00

TMPDIR=/scratch/$USER/$SLURM_JOBID
export TMPDIR
mkdir -p $TMPDIR

export PATH="/usr/home/qgg/fpanah/miniconda3/bin:$PATH"
source ~/miniconda3/etc/profile.d/conda.sh
source ~/miniconda3/etc/profile.d/mamba.sh

source activate atlasenv
cd ~/data/ccd.wp2

atlas run qc --profile cluster --jobs 50 -n

cd $SLURM_SUBMIT_DIR
rm -rf /scratch/$USER/$SLURM_JOBID

I also checked the default and available queues on my cluster with sinfo and chose the best one; it works for my other tasks, just not for atlas :(

I have also added an entry matching my queue to the queues.tsv file and commented out the other queues, but it still didn't work:

# column names and units should be the same as in the key_mapping.yaml
# the queue with the lowest priority value is chosen first
queue   priority        threads mem_mb  time_min
ghpc_v3 1       32      350000  4320
#small  1       40      382000  4320
#large  2       1040    382000  4320
#hugemem        3       160     1534000 4320
#longrun        4       40      382000  20160
#hugemem_longrun        6       40      1534000 5760
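As a sanity check on the file above, you can pick out the uncommented queue with the lowest priority value, i.e. the one that will be chosen first. This is just an illustrative sketch: the sample file path and the awk one-liner are my own, not part of atlas.

```shell
# Recreate a minimal queues.tsv in the same tab-separated layout as above
printf 'queue\tpriority\tthreads\tmem_mb\ttime_min\n'  > /tmp/queues.tsv
printf 'ghpc_v3\t1\t32\t350000\t4320\n'               >> /tmp/queues.tsv

# Skip the header and comment lines, then print the name of the queue
# with the lowest priority value (column 2)
awk -F'\t' 'NR > 1 && $0 !~ /^#/ && (best == "" || $2+0 < best) { best = $2+0; q = $1 } END { print q }' /tmp/queues.tsv
```

With only the ghpc_v3 row uncommented, this should print `ghpc_v3`.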

Here is what the error says (this was a dry run):

[2022-09-27 10:46 INFO] Executing: snakemake --snakefile /usr/home/qgg/fpanah/miniconda3/lib/python3.7/site-packages/atlas/Snakefile --directory /usr/home/qgg/fpanah/data/ccd.wp2 --jobs 50 --rerun-incomplete --configfile '/usr/home/qgg/fpanah/data/ccd.wp2/config.yaml' --nolock  --profile cluster --use-conda --conda-prefix /usr/home/qgg/fpanah/data/ccd.wp2/metagenomics_seq/conda_envs --dryrun  --scheduler greedy  qc
OSError in line 283 of /usr/home/qgg/fpanah/miniconda3/lib/python3.7/site-packages/atlas/Snakefile:
Error in annotations requested, check config file 'annotations'
  File "/usr/home/qgg/fpanah/miniconda3/lib/python3.7/site-packages/atlas/Snakefile", line 294, in <module>
  File "/usr/home/qgg/fpanah/miniconda3/lib/python3.7/site-packages/atlas/Snakefile", line 283, in get_genome_annotations
[2022-09-27 10:46 CRITICAL] Command 'snakemake --snakefile /usr/home/qgg/fpanah/miniconda3/lib/python3.7/site-packages/atlas/Snakefile --directory /usr/home/qgg/fpanah/data/ccd.wp2 --jobs 50 --rerun-incomplete --configfile '/usr/home/qgg/fpanah/data/ccd.wp2/config.yaml' --nolock  --profile cluster --use-conda --conda-prefix /usr/home/qgg/fpanah/data/ccd.wp2/metagenomics_seq/conda_envs --dryrun  --scheduler greedy  qc   ' returned non-zero exit status 1.

Much appreciated in advance.

Kind regards,
Farhad

The error is not related to the cluster submission. I know there are multiple config.yaml files, which can be confusing.

In your atlas working directory there should be the atlas config.yaml. In it there should be a section annotations:
It should resemble this one:

annotations:
  - gtdb_tree
  - gtdb_taxonomy
  - genes
  - kegg_modules
  - dram
Note to self: make this error message clearer.
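A quick way to eyeball that the section is present and lists those values is to pull it out of the file directly. This is only a sketch: the temp file path and the awk filter are illustrative (in a real run you would point this at the config.yaml in your atlas working directory), and the five annotation values are simply the ones shown above.

```shell
# Recreate a sample config with the annotations section shown above
cat > /tmp/atlas_config_check.yaml <<'EOF'
annotations:
  - gtdb_tree
  - gtdb_taxonomy
  - genes
  - kegg_modules
  - dram
EOF

# Print the annotations header plus its two-space-indented list items
awk '/^annotations:/ { p = 1; print; next } p && /^  - / { print; next } p { exit }' /tmp/atlas_config_check.yaml
```

If the section is intact you get back exactly the six lines above; a missing or mis-indented entry would show up as a shorter listing.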

Hi Silas,

Thanks for your reply. My QC step is completed now. I had actually checked that part of my config.yaml and it was okay. Should it be indented with two spaces or a tab, or doesn't it matter?
However, I could submit atlas run all via the cluster without any errors :) so probably there was nothing wrong with my cluster_config and config.yaml and their communication with my cluster. Out of curiosity, why would qc run an annotation rule at all?

So yes, it does matter: the format should be two spaces. But any somewhat sophisticated editor translates a tab into two spaces anyway.
Also, if something were wrong with the file, you would get a different error.
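If in doubt, you can check whether a literal tab slipped into the file; YAML forbids tabs for indentation, so a parser would raise its own (different) error on one. A minimal sketch, with an illustrative temp file standing in for the real config.yaml:

```shell
# Recreate a sample YAML line indented with a literal tab character
printf 'annotations:\n\t- genes\n' > /tmp/tab_check.yaml

# Print the numbers of any lines containing a tab
grep -n "$(printf '\t')" /tmp/tab_check.yaml
```

Here this prints line 2; on a clean file grep finds nothing and exits non-zero.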

You said you ran only atlas run qc and you got this error?

So the indentation was right; I had already checked it before running qc. And yes, I got the error only when I submitted 'atlas run qc --profile cluster' via sbatch. Because of the error, I ran it on the login/front node instead, and there it submitted the jobs in order and finished in about 20 h.
Now 'atlas run all --profile cluster --jobs 50' has been submitted to the cluster via sbatch and is running fine so far :)