cgat-developers/cgat-core

Changing time requested for job

Closed this issue · 7 comments

Hi,

I am moving this issue here from CGATOxford/CGATPipelines#437.

I tried adding the following to my pipeline script before P.run:
job_options = " -t 10-00:00:00"

However, I got the following error:

drmaa.errors.InvalidAttributeFormatException: code 13: invalid date/time format: 10-00:00:00

I also tried job_options = " --time=10-00:00:00" but got the same error.

Any thoughts?

Best wishes,
Lucy

Hi Lucy,

Looking at:

It says:

Set a limit on the total run time of the job. [..] When the time limit is reached, the job will be killed.

is this what you are after?

From your request I understood that you are looking to start a job at a specified time instead. If that's the case, you want something like --begin=<time>.

Best regards,
Sebastian

Hi Sebastian,

Apologies for not explaining clearly. Yes, I am looking for the former - to set a limit on the total run time of the job. The default on our cluster is 7 days but my job takes longer than that so I am trying to increase it.

Best wishes,
Lucy

Hi Lucy,

Try interacting with Slurm directly to check whether it accepts the specified date/time format. What happens when you run the command below?

salloc --time=10-00:00:00 --nodes=1 hostname

Best regards,
Sebastian

Hi @sebastian-luna-valero,

That gives me the following:

salloc: Pending job allocation 5040753
salloc: job 5040753 queued and waiting for resources
salloc: job 5040753 has been allocated resources
salloc: Granted job allocation 5040753
salloc: Waiting for resource configuration
salloc: Nodes cbrgwn022p are ready for job
cbrglogin1
salloc: Relinquishing job allocation 5040753

Best wishes,
Lucy

Hi Lucy,

With the following command you should be able to check which Slurm partitions you can submit jobs to:

sacctmgr show user --association

With the following command you should be able to check the maximum time that you can request on that partition:

scontrol show partition <partition>

Check the value of MaxTime.
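
If you want to read MaxTime out of the scontrol output in a script rather than by eye, something like the following might do. It is only a sketch: the helper name parse_max_time and the sample text are made up for illustration, and the sample is not real output from any cluster.

```python
import re


def parse_max_time(scontrol_output):
    """Return the MaxTime value found in `scontrol show partition` text, or None."""
    # Fields are printed as space-separated KEY=VALUE pairs.
    match = re.search(r"\bMaxTime=(\S+)", scontrol_output)
    return match.group(1) if match else None


# Hypothetical, shortened sample (not copied from a real cluster).
sample = "PartitionName=batch\n   MaxNodes=UNLIMITED MaxTime=10-00:00:00 MinNodes=0"
print(parse_max_time(sample))
```

On a real system the text would come from running scontrol show partition <partition>, for example via subprocess.run.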

Say for example you get MaxTime=10-00:00:00, then you should be able to do:

 job_options = " -t 240:00:00"

instead of

 job_options = " -t 10-00:00:00"

to set the time limit to 10 days.
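
The conversion from days to hours is just days * 24 + hours. A minimal sketch of a helper that does this is below; the name slurm_days_to_hours is hypothetical and not part of cgat-core or Slurm. It assumes the input is either in D-HH:MM:SS form or is already something that should be passed through untouched.

```python
import re


def slurm_days_to_hours(limit):
    """Convert 'D-HH:MM:SS' to 'H:MM:SS'; return any other string unchanged."""
    match = re.fullmatch(r"(\d+)-(\d+):(\d{2}):(\d{2})", limit)
    if match is None:
        # Not in days-hyphen form, so leave it alone.
        return limit
    days, hours, minutes, seconds = (int(g) for g in match.groups())
    return f"{days * 24 + hours}:{minutes:02d}:{seconds:02d}"


# Build the option string in the form used in this thread.
job_options = " -t " + slurm_days_to_hours("10-00:00:00")
print(job_options)  # prints " -t 240:00:00"
```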

Please try and let us know.

Best regards,
Sebastian

Thank you @sebastian-luna-valero,

I think this is working:

def test():
    statement = '''sleep 10d'''

    job_options = " -t 240:00:00"

    P.run(statement)

If I check the job using scontrol show job <job-number>, I get:

RunTime=00:01:18 TimeLimit=10-00:00:00 TimeMin=N/A

Our cluster has now changed the default run time to 1 hour, which is too short for a lot of my jobs. Is there a way to change the default -t option for the whole pipeline, rather than having to add job_options to each function?

Best wishes,
Lucy

Adding job_options: -t 240:00:00 to the YAML file seems to be working.