Number of CPUs detection :: possible oversubscription problems
Opened this issue · 2 comments
EricDeveaud commented
Hello,
centrifuge_evaluate.py uses multiprocessing.cpu_count() to get the number of CPUs.
However, multiprocessing.cpu_count() returns the number of CPUs in the machine, which is not the same as the number of CPUs available to the process. The two differ, for example, when running under taskset or under a batch scheduler such as Slurm, and sizing the worker pool from the machine-wide count can then oversubscribe the CPUs actually allocated.
see:
$ nproc
96
$ taskset -c 1 nproc
1
$ taskset -c 1 python3 -c "import multiprocessing; print(multiprocessing.cpu_count())"
96
I would suggest using len(os.sched_getaffinity(0)) instead of multiprocessing.cpu_count():
$ python3 -c "import os; print(len(os.sched_getaffinity(0)))"
96
$ taskset -c 1 python3 -c "import os; print(len(os.sched_getaffinity(0)))"
1
NB: Python on Mac OS X does not have os.sched_getaffinity,
so a portable way to code it would be:
import multiprocessing
import os

try:
    num_cpus = len(os.sched_getaffinity(0))
except AttributeError:
    num_cpus = multiprocessing.cpu_count()
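To make the intent concrete, here is a minimal sketch of how the fallback above could be wrapped in a helper and used to cap a user-requested worker count. The names available_cpus and resolve_worker_count are hypothetical and not part of centrifuge_evaluate.py; how the script actually consumes the CPU count is an assumption.

```python
import multiprocessing
import os


def available_cpus():
    """Return the number of CPUs this process may run on (hypothetical helper)."""
    try:
        # Where supported (e.g. Linux): CPUs the current process (pid 0 = self)
        # is allowed to use, which honours taskset and scheduler limits.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # No affinity API (e.g. Mac OS X, Python 2): fall back to the
        # machine-wide count.
        return multiprocessing.cpu_count()


def resolve_worker_count(requested=None):
    """Clamp a requested worker count to the CPUs actually available."""
    limit = available_cpus()
    if requested is None:
        return limit
    # Never return fewer than 1 worker or more than the allowed CPUs.
    return max(1, min(requested, limit))
```

With this, a call such as multiprocessing.Pool(resolve_worker_count()) would start only as many workers as the process is allowed to run in parallel.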
regards
Eric
EricDeveaud commented
Note that os.sched_getaffinity is only available in Python 3 (3.3 and later).
EricDeveaud commented
For Python 2, a solution would be:
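import multiprocessing
import re

# default: machine-wide count (os.cpu_count() does not exist in Python 2)
num_cpus = multiprocessing.cpu_count()
try:
    with open('/proc/self/status') as f:
        m = re.search(r'(?m)^Cpus_allowed:\s*(.*)$', f.read())
    if m:
        # Cpus_allowed is a hexadecimal bitmask; count the bits that are set
        num_cpus = bin(int(m.group(1).replace(',', ''), 16)).count('1')
except IOError:
    pass  # no /proc (e.g. Mac OS X): keep the default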
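The mask parsing can be checked in isolation. Below is a small sketch that factors it into a function and runs it on a made-up status excerpt; the function name count_allowed_cpus and the sample text are illustrative assumptions, not real output from any machine.

```python
import re


def count_allowed_cpus(status_text):
    """Count the CPUs set in the Cpus_allowed mask of a /proc/<pid>/status dump.

    Returns None when the field is missing.
    """
    m = re.search(r'(?m)^Cpus_allowed:\s*(\S+)$', status_text)
    if m is None:
        return None
    # The mask is hexadecimal, written as comma-separated 32-bit groups.
    mask = int(m.group(1).replace(',', ''), 16)
    return bin(mask).count('1')


# Made-up excerpt resembling a process pinned to CPU 1 (bit 1 set).
sample = "Name:\tpython3\nCpus_allowed:\t00000002\nCpus_allowed_list:\t1\n"
print(count_allowed_cpus(sample))  # 1
```

Returning None for a missing field lets the caller decide on the fallback, rather than leaving num_cpus undefined.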
regards
Eric