Command line arguments not overriding defaults
--cluster-queue-manager slurm not overriding sge when supplied on command line
Yeah, I noticed this a while back, but wanted to wait for a meeting with others before deciding on the option priority, as I think this may need revisiting.
I have a different issue here, in that the options in my .cgat.yml file are not overriding the sge default; however, if I supply the arguments on the command line, then my pipelines run fine with slurm.
I am also having the issue (not sure whether it is related) that if I run my pipeline from a particular Conda environment, the pipeline statements don't seem to be run within that environment unless I add the following:
P.run(statement, job_condaenv=PARAMS["conda_env"])
cgatcore 0.6.7 py_0 bioconda
drmaa 0.7.9 py_1000 conda-forge
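For anyone else hitting this, the workaround sits inside a task roughly as follows (a minimal sketch with hypothetical task and file names; conda_env is a key you define yourself in pipeline.yml or .cgat.yml, not a cgatcore default):

from cgatcore import pipeline as P
from ruffus import transform, suffix

# configuration merged from pipeline.yml plus the global .cgat.yml
PARAMS = P.get_parameters(["pipeline.yml"])

@transform("*.fastq.gz", suffix(".fastq.gz"), ".nreads")
def count_reads(infile, outfile):
    # without job_condaenv, the statement runs in whatever environment the
    # non-interactive shell on the compute node happens to provide
    statement = "zcat %(infile)s | wc -l > %(outfile)s"
    P.run(statement, job_condaenv=PARAMS["conda_env"])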
Hmm, that's strange that the .cgat.yml isn't overriding the sge defaults. I am now on a slurm cluster and can debug to find the issue.
This is the opposite of what David has reported: for you, the command line seems to override correctly.
Regarding the conda env issue, that's strange and suggests that your environment may not be getting copied to the non-interactive shell on the cluster for some reason.
Yes, that's the strange thing: it's the opposite of what David is saying. The command line seems to override for me.
@Acribbs I am still having this issue on the CCB server and I know several others who are as well, including @deevdevil88. Any thoughts?
Hi Lucy,
Things are pretty busy at the moment with grant writing and paper submission, but I will try to take a look in the next few days. In the meantime, can you share your .cgat.yml file and a test dataset that highlights the issues?
Thanks
These are the contents of my .cgat.yml file, which is in my home directory (GitHub doesn't support the .yml file ending):
cluster:
    queue_manager: slurm
    queue: batch
I have to run my pipelines as follows:
python pipeline.py --cluster-queue-manager slurm --cluster-queue batch make full
I also have to specify the conda environment for each statement within P.run().
It works fine on the BMRC SGE server, so I think it is a Slurm issue.
I will send you an example mini pipeline and input files.
Hi Adam!
Similarly to what @lucygarner explains, I found that on my SLURM cluster the worker node won't inherit the environment activated on the login node, so variables and packages are not picked up at all.
Our solution to this is to specify the conda environment in the .cgat.yml file (or separately in each pipeline.yml file, if different environments are needed for each pipeline) and change the P.run(...) calls in the pipeline.py to pass the conda env param, i.e. P.run(statement, job_condaenv=PARAMS["conda_env"]).
I was hoping for a different workaround though, one that doesn't rely on having to rewrite the P.run() statements but defines the same conda env for all jobs.
Any idea?
thanks!!
fabiola
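One way to get that effect without touching every task, sketched here under the assumption that conda_env is defined once in pipeline.yml or .cgat.yml, is a small pipeline-level wrapper around P.run (the helper and the key name are hypothetical, not a built-in cgatcore feature):

from cgatcore import pipeline as P

PARAMS = P.get_parameters(["pipeline.yml"])

def run(statement, **kwargs):
    # hypothetical helper: route every statement through the configured
    # conda environment, falling back to plain P.run if none is set
    conda_env = PARAMS.get("conda_env")
    if conda_env:
        kwargs.setdefault("job_condaenv", conda_env)
    return P.run(statement, **kwargs)

Tasks would then call run(statement) instead of P.run(statement, job_condaenv=PARAMS["conda_env"]), so the environment is declared once per pipeline rather than per statement.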
@bio-la, are your pipeline scripts picking things up from the .cgat.yml file? I seem to be having the additional issue that the info in that file is not picked up.
Yes, my pipeline script picks up PARAMS from .cgat.yml as well as pipeline.yml.
I wasn't aware of this until Charlotte pointed it out to me, and it works; I just tested it.
Hi,
python pipeline.py printconfig
may help troubleshoot the issue. Specifically, there should be a section in the output, "List of .yml files used to configure the pipeline", listing the yaml files being picked up and the priority with which they are loaded (higher priority overrides lower ones).
I hope that helps.
Best regards,
Sebastian
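The merged values can also be inspected from a short script, which makes it easier to see which file wins for a given key (a minimal sketch; it assumes the flattened key names shown by printconfig, e.g. cluster_queue_manager):

from cgatcore import pipeline as P

# load the configuration the same way a pipeline does; cgatcore merges the
# yml files that printconfig lists, including the .cgat.yml in your home directory
PARAMS = P.get_parameters(["pipeline.yml"])

for key in ("cluster_queue_manager", "cluster_queue"):
    print(key, ":", PARAMS.get(key))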
Thanks @sebastian-luna-valero. Very helpful! Interestingly, it says it is giving my .cgat.yml file the highest priority.
List of .yml files used to configure the pipeline
Priority : File
2 : mapping_pipeline.yml
(highest) 1: /home/medawar/lgarner/.cgat.yml
These are the contents of my .cgat.yml in my home directory:
cluster:
    queue_manager: slurm
    queue: batch
But somehow when I run the pipeline, it tries to use sge unless I specify --cluster-queue-manager slurm.
This is what it looks like if I run the pipeline without --cluster-queue-manager slurm:
# 2022-02-10 18:11:19,990 INFO always_mount : False \
# cluster_memory_default : unlimited \
# cluster_memory_resource : None \
# cluster_num_jobs : None \
# cluster_options : None \
# cluster_parallel_environment : None \
# cluster_priority : None \
# cluster_queue : None \
# cluster_queue_manager : sge \
# config_file : pipeline.yml \
Hi,
Please paste the output of:
python pipeline.py printconfig | grep "cluster_queue"
python pipeline.py printconfig --cluster-queue-manager slurm | grep "cluster_queue"
I would like to compare the values of cluster_queue_manager in the lines of the output starting with and without #.
Best regards,
Sebastian
Hi Sebastian,
I met with Lucy and sorted out her issues (bashrc issues), but in the process realised that while the pipeline submits jobs to slurm correctly, it prints out the default sge values at the beginning of the pipeline. I can take a look at this, but I suspect it may be displaying the params before they are updated.
Hi @Acribbs,
Sorry to open this again.
I'm also having the same problem as Lucy, so it would be good to know what the fix is and what I need to add or remove from my bashrc to fix this.
Best
Devika
Can you share your bashrc with me and I can see if it has the issue?
@Acribbs, sure thing, here you go.
Hmm, there doesn't seem to be a major issue with your bashrc; however, can you try commenting out the SHARED_TMPDIR and the DRMAA_LIBRARY lines and running the pipeline again? What was your specific issue? You cannot submit jobs to the cluster?
The issue that affected @lucygarner was related to her environment not inheriting variables in non-interactive shells, fixed by:
if [[ $PS1 ]]; then
    # bash code that should only run in interactive shells
fi
@Acribbs, same issue as @lucygarner: when I run cgatcore pipelines on the CBRG, they don't inherit the params from the .cgat.yml file in my home directory. I am able to run the pipeline if I specify the --cluster-queue-manager and --cluster-queue params in the pipeline make command when I run the pipeline, but not otherwise.
Do you get DRMAA API errors? Can you show me a failed log trace please, and one that works?
You know what, I actually hadn't checked running my pipelines without setting --cluster-queue and --cluster-queue-manager since I last encountered the error back when we discussed this on the issue, and it seems like it works now. I imagine it must have been to do with my .bashrc originally, and I guess back in May last year, when we all used the same .bashrc for the CCB based on Charlie's example, that must have sorted it out. But I didn't realise. Sorry, it seems like it works now.
Ah brilliant!