eth-cscs/sarus

Too many cpus requested

haampie opened this issue · 2 comments

On my PC with an AMD Ryzen 7 3700X and Linux 5.4.0, the number of requested CPUs is too large, which causes the container to fail to start.

The generated config.json for runc contains ... "linux":{"resources":{"cpu":{"cpus":"0-31"}} ... which indeed corresponds to the Cpus_allowed_list:

$ cat /proc/self/status | grep Cpus_allowed_list
Cpus_allowed_list:	0-31

but I only have 8 cores / 16 threads:

$ nproc
16

The error I'm getting is

ERRO[0000] container_linux.go:349: starting container process caused "process_linux.go:297: applying cgroup configuration for process caused \"failed to write \\\"0-31\\\" to \\\"/sys/fs/cgroup/cpuset/container-ccxounuuahznpsds/cpuset.cpus\\\": write /sys/fs/cgroup/cpuset/container-ccxounuuahznpsds/cpuset.cpus: invalid argument\"" 

If I hard-code cpus to 0-15 everything is fine.
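
For illustration, the manual workaround above (narrowing 0-31 to 0-15) can be sketched as intersecting the allowed list with the set of CPUs that are actually online. This is only a rough sketch, not the fix that went into sarus: the function names `parse_cpu_list`, `format_cpu_list`, and `clamp_to_online` are hypothetical, and it assumes both inputs use the plain kernel cpulist format (comma-separated ids and ranges, no stride syntax).

```python
def parse_cpu_list(text):
    """Parse a cpulist string such as "0-3,8,10-11" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpu_list(cpus):
    """Format a set of CPU ids back into a compact cpulist string."""
    ordered = sorted(cpus)
    ranges = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current range while the next id is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


def clamp_to_online(allowed, online):
    """Keep only the allowed CPUs that are also online (hypothetical helper)."""
    return format_cpu_list(parse_cpu_list(allowed) & parse_cpu_list(online))


if __name__ == "__main__":
    # Values from this report: 0-31 allowed, but only 16 hardware threads.
    print(clamp_to_online("0-31", "0-15"))
```

On a real system the two inputs could come from the Cpus_allowed_list line of /proc/self/status and from /sys/devices/system/cpu/online; whether that intersection is the right value to write to cpuset.cpus in every setup is an assumption of this sketch.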

Hello @haampie, we are aware that the mechanism for assigning CPU affinity does not work properly in some situations.
A fix was merged into the development branch a couple of days ago (c98a44e) and will be available in the next tagged release.

Ah, thanks, didn't notice that!