Snakemake-Profiles/htcondor

Submit requirement ReqCpus evaluated to non-boolean

Closed this issue · 2 comments

Hi, I'm very new to HTCondor.

When I run my program I get this error:

Traceback (most recent call last):
  File "~/.config/snakemake/htcondor/grid-submit.py", line 36, in <module>
    clusterID = sub.queue(txn)
  File "/anaconda3/envs/myenv/lib/python3.7/site-packages/htcondor/_lock.py", line 99, in __exit__
    return self.cm.__exit__(*args, **kwargs)
  File "/anaconda3/envs/mge-project/lib/python3.7/site-packages/htcondor/_lock.py", line 69, in wrapper
    rv = func(*args, **kwargs)
htcondor.HTCondorIOError: Failed to commit and disconnect from queue. SCHEDD:2:Submit requirement ReqCpus evaluated to non-boolean.

I tried to work around it by adding permutations of likely key names ('ReqCpus', 'requestcpus', 'cpus', 'totalcpus', etc.) with various values ('1', '0', 'true', 1, 0, False, etc.) to the 'sub' variable, to no avail. I also could not find this error anywhere online.

Any suggestions?

Hello nongiga,

do you know where the "ReqCpus" comes from? In this profile's submit code I only see request_cpus. I had problems with the submit script on another cluster, so I also have this alternative grid-submit.py you could try out:
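
For what it's worth, submit requirements like "ReqCpus" are usually defined by the administrator in the schedd's configuration, not in your submit file. A hypothetical sketch of what such a knob could look like (the name and limit here are illustrative, not from your cluster):

# hypothetical condor_config snippet defining a submit requirement
SUBMIT_REQUIREMENT_NAMES = $(SUBMIT_REQUIREMENT_NAMES) ReqCpus
SUBMIT_REQUIREMENT_ReqCpus = RequestCpus <= 8
SUBMIT_REQUIREMENT_ReqCpus_REASON = "Jobs may request at most 8 CPUs"

If the expression references an attribute that is undefined in the job ad, it evaluates to UNDEFINED rather than true/false, which is one way to get exactly this "evaluated to non-boolean" error, so it may be worth asking your administrator about that configuration.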

#!/usr/bin/env python3

import sys
from os import makedirs
from os.path import join
from uuid import uuid4
from subprocess import check_output

from snakemake.utils import read_job_properties


jobscript = sys.argv[1]
job_properties = read_job_properties(jobscript)

UUID = uuid4()  # random UUID
jobDir = '/afs/cern.ch/work/j/jheuel/htcondor/{}_{}'.format(job_properties['jobid'], UUID)
makedirs(jobDir, exist_ok=True)

sub = {
    'universe':     'vanilla',
    'executable':   '/bin/bash',
    'arguments':    jobscript,
    'max_retries':  '5',
    'log':          join(jobDir, 'condor.log'),
    'output':       join(jobDir, 'condor.out'),
    'error':        join(jobDir, 'condor.err'),
    'getenv':       'True',
    'request_cpus': str(job_properties['threads']),
}

request_memory = job_properties['resources'].get('mem_mb', None)
if request_memory is not None:
    sub['request_memory'] = str(request_memory)

max_run_time = job_properties['resources'].get('max_run_time', None)
if max_run_time is not None:
    sub['+MaxRunTime'] = str(max_run_time)

submitScript = join(jobDir, 'job.submit')
with open(submitScript, 'w') as f:
    for k, v in sub.items():
        f.write(f'{k} = {v}\n')
    f.write('queue')

output = check_output(['condor_submit', submitScript])

# example output:
# 1 job(s) submitted to cluster 6669398.

# extract the cluster ID from the condor_submit output
clusterID = int(str(output).split('cluster')[-1].split('.')[0])


# print jobid for use in Snakemake
print('{}_{}_{}'.format(job_properties['jobid'], UUID, clusterID))
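
To sanity-check the cluster ID parsing, here is the same expression applied to a made-up condor_submit output line (the cluster number is illustrative):

```python
# hypothetical condor_submit output; check_output returns bytes
output = b'1 job(s) submitted to cluster 6669398.\n'

# same parsing as in the script: take the text after 'cluster',
# drop everything from the first '.' onward, and convert to int
clusterID = int(str(output).split('cluster')[-1].split('.')[0])
print(clusterID)
```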

Hi,

Thank you for the speedy reply. It turned out that I got this error because the system administrator had configured my account incorrectly. Once that was fixed, Snakemake ran smoothly.

Thank you so much for this profile! I never expected moving Snakemake from a local machine to a cluster to be so smooth.