Options for GPU Sharing between Containers Running on a Workstation
frenchwr opened this issue · 5 comments
Describe the support request
Hello, I'm trying to understand options that would allow multiple containers to share a single GPU.
I see that K8s device plugins in general are not meant to allow a device to be shared between containers.
I also see from the GPU plugin docs in this repo that there is a sharedDevNum option that can be used for sharing a GPU, but I infer this partitions the resources on the GPU so that each container is only allocated a fraction of the GPU's resources. Is that correct?
My use case is a tool called data-science-stack that is being built to automate the deployment/management of GPU-enabled containers for quick AI/ML experimentation on a user's laptop or workstation. In this scenario we'd prefer that each container have access to the full GPU resources - much like you'd expect for applications running directly on the host. Is this possible?
System (please complete the following information if applicable):
- OS version: Ubuntu 22.04
- Kernel version: Linux 5.15 (HWE kernel for some newer devices)
- Device plugins version: v0.29.0 and v0.30.0 are the versions I've worked with
- Hardware info: iGPU and dGPU
sharedDevNum is mostly intended to be used when either:
- the user running the workloads manually takes care of not oversubscribing the GPUs (e.g. in a test cluster),
- the cluster runs only GPU workload(s) which can share a GPU up to sharedDevNum (e.g. a cluster dedicated to running a single GPU workload), or
- pod specs indicate how much of the GPU's resources each container uses, and the cluster is running GAS [1], which shares GPUs based on the specified resource consumption.

There's no enforcement limiting a workload to the specified amount; those values are used only for scheduling / device selection.

[1] GPU Aware Scheduling: https://github.com/intel/platform-aware-scheduling/tree/master/gpu-aware-scheduling
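As a configuration sketch, assuming the Intel Device Plugins Operator is used, sharedDevNum is set in the GpuDevicePlugin custom resource roughly like this (field values are illustrative and should be checked against this repo's sample deployments):

```yaml
# Sketch only, assuming the operator's GpuDevicePlugin CRD; verify field
# names and the image tag against the sample manifests in this repo.
apiVersion: deviceplugin.intel.com/v1
kind: GpuDevicePlugin
metadata:
  name: gpudeviceplugin-sample
spec:
  image: intel/intel-gpu-plugin:0.30.0   # placeholder tag
  sharedDevNum: 10                       # each physical GPU is advertised 10 times
  logLevel: 2
```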
> In this scenario we'd prefer the containers have the ability to each have access to the full GPU resources

If each container (in the cluster) is supposed to have exclusive access to the GPU device, use 1 for sharedDevNum.
> If each container (in the cluster) is supposed to have exclusive access to the GPU device, use 1 for sharedDevNum.
But this does not allow the GPU to be shared between containers, correct?
Maybe a bit more context about the use case would help. We are building an application that simplifies the deployment of GPU-enabled containers (for example, using Intel's ITEX and IPEX images). This is not meant for deployments across a cluster of nodes. There is just a single node (the user's laptop or workstation).
Each container runs a Jupyter Notebook server. Ideally, a user could be on a workstation with a single GPU and multiple containers running, with each provided full access to the GPU. Notebook workloads are typically very bursty, so container A may run a notebook cell that is very GPU intensive while container B is idle. In cases where both containers are simultaneously requesting GPU acceleration, ideally that would be handled the same way (or close to the same way) as two applications running directly on the host OS requesting GPU resources.
@frenchwr sharedDevNum is the option you would most likely want. Any container requesting the gpu.intel.com/i915 resource with sharedDevNum > 1 will get "unlimited" access to the GPU - "unlimited" in the sense that there's no hard partitioning etc. in place. Obviously, if two containers try to run on the same GPU, they will fight for the same resources (execution time, memory).
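To make that concrete, here is a minimal sketch of a notebook pod requesting one of the shared slots (the pod name and image are placeholders, not recommendations):

```yaml
# Sketch only: each container asks for one gpu.intel.com/i915 "slot";
# with sharedDevNum > 1, several such containers can land on the same GPU.
apiVersion: v1
kind: Pod
metadata:
  name: notebook-a
spec:
  containers:
  - name: jupyter
    image: intel/intel-extension-for-tensorflow:latest   # placeholder image
    resources:
      limits:
        gpu.intel.com/i915: 1   # one slot, not a hard partition of the GPU
```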
@tkatila Thanks for clarifying! I agree this sounds like the way to go. A few more follow-up questions:
- Am I right that running with resource management disabled (default behavior) would make the most sense for our use case?
- Is there any performance impact from the number you choose for sharedDevNum? For example, using 2 vs. 10 vs. 100. I guess not, if there is no partitioning of the GPU resources, but I just want to confirm. Is there any reason not to choose an arbitrarily large number if our goal is to expose the full GPU to each container?
> Am I right that running with resource management disabled (default behavior) would make the most sense for our use case?

Yes, that's correct, keep it disabled. To enable resource management you would also need another k8s component (GPU Aware Scheduling, or GAS). Its setup requires some hassle and I don't see any benefit from it in your case.
> Is there any performance impact from the number you choose for sharedDevNum? For example, using 2 vs. 10 vs. 100. I guess not, if there is no partitioning of the GPU resources, but I just want to confirm. Is there any reason not to choose an arbitrarily large number if our goal is to expose the full GPU to each container?
I don't think we have any guide for selecting the number, but something between 10 and 100 would be fine. The downside with an extremely large number is that it might incur some extra CPU and network bandwidth utilization. The GPU plugin detects the number of GPUs, multiplies it by sharedDevNum, and then creates that many duplicate resources for the node. Carrying all those resources in resource registration and during scheduling has some minor effect, but as long as sharedDevNum stays within a sensible range, the effect shouldn't be noticeable.
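As a rough illustration of that arithmetic (a sketch, not actual output), a workstation with one physical GPU and sharedDevNum set to 10 would report something like the following in its node status:

```yaml
# Sketch of the node's extended resources: 1 physical GPU x sharedDevNum 10
# = 10 schedulable gpu.intel.com/i915 copies, with no partitioning implied.
status:
  allocatable:
    gpu.intel.com/i915: "10"
  capacity:
    gpu.intel.com/i915: "10"
```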