neuro-inc/flow-template

cpu resources of gpu-large and gpu-small presets


Does a job with the gpu-large preset have fewer CPU resources than a job with the gpu-small preset?

It looks like it does. The output of multiprocessing.cpu_count() shows only 8 available CPUs in the system for the gpu-large preset, but 32 for gpu-small. Here are the screenshots:

[Screenshot: multiprocessing.cpu_count() output for the gpu-large preset]

[Screenshot: multiprocessing.cpu_count() output for the gpu-small preset]

Why are the outputs different for these presets? What does the #CPU column in the output of neuro config show mean? I thought that column showed how many CPU resources were set (e.g. as in neuro submit --cpu 2 ...), but apparently it does not.
[Screenshot: neuro config show output]

https://docs.python.org/3/library/os.html#os.cpu_count

This number is not equivalent to the number of CPUs the current process can use. The number of usable CPUs can be obtained with len(os.sched_getaffinity(0))
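As a minimal illustration of the difference (assuming Linux, where os.sched_getaffinity is available; note that a Kubernetes CPU limit is enforced as a CFS quota, not as an affinity mask, so inside such a container both calls can still report all host cores):

```python
import os

# Number of CPUs in the host/VM, regardless of any container limit.
print("os.cpu_count():", os.cpu_count())

# CPUs the current process may be scheduled on (its affinity mask).
# A CFS quota does not shrink this mask, so in a quota-limited
# container this may still show every host core.
print("usable CPUs:", len(os.sched_getaffinity(0)))
```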

8 and 32 are the host CPU counts; the #CPU column shows the CPU count available to the job.
I.e. a host contains 32 CPUs and 4 Tesla K80 GPUs, so we can run up to 4 gpu-small jobs there.

Our configuration depends on the cloud provider's instance configuration.

Maybe I am wrong, but it looks like I can use all 32 CPU cores in a job with the gpu-small preset, only at 1/4 of the performance. I thought the correct way to run 4 jobs on one node was not to share CPU cores between jobs.

Maybe I was just misled by the htop output:

[Screenshot: htop output before the script run]

[Screenshot: htop output after the script run]

I also achieved the highest dataset-processing performance with n_jobs = 16 or 32.

@atselousov, you are right.
The 7.0 CPU limit is the total load across all cores: in your screenshots, 7.0 / 32 cores = 21.88% per core.
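A minimal sketch of how to check this limit from inside a job, assuming cgroup v1 paths (cgroup v2 exposes /sys/fs/cgroup/cpu.max instead):

```python
def effective_cpu_limit():
    """Return the CFS CPU limit (e.g. 7.0), or None if unlimited."""
    try:
        with open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us") as f:
            quota = int(f.read())
        with open("/sys/fs/cgroup/cpu/cpu.cfs_period_us") as f:
            period = int(f.read())
    except OSError:
        return None  # no cgroup CPU controller mounted here
    if quota <= 0:
        return None  # quota of -1 means "no limit"
    return quota / period  # e.g. 700000 / 100000 = 7.0 CPUs

print("CPU limit:", effective_cpu_limit())
```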

@mariyadavydova, do we need to clarify this somewhere in the documentation?
Do we need to show the host CPU count in the preset information too?

@dalazx, we can use integers for CPU requests/limits. In that case exclusive cores are assigned to the task instead of shared ones: https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/ . This could be useful, e.g., for some sequential computation (cpu=1). Do we need this functionality?
There is a performance comparison here: https://kubernetes.io/blog/2018/07/24/feature-highlight-cpu-manager/
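For reference, a hypothetical pod spec sketch (names and image are illustrative): with the kubelet's static CPU manager policy enabled, a pod in the Guaranteed QoS class whose integer CPU request equals its limit gets exclusive cores instead of shared CFS time:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-job          # illustrative name
spec:
  containers:
  - name: worker
    image: python:3.8       # illustrative image
    resources:
      requests:
        cpu: "4"            # integer CPU request...
        memory: "8Gi"
      limits:
        cpu: "4"            # ...equal to the limit -> Guaranteed QoS
        memory: "8Gi"
```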

@shagren, we could potentially expose an option, but I am not sure it is actually needed.