moby/moby

NVIDIA GPU support

3XX0 opened this issue · 39 comments

3XX0 commented

Hello, author of nvidia-docker here.

As many of you may know, we recently released our NVIDIA Docker project in an effort to enable GPUs in containerized applications (mostly deep learning). The project currently consists of two parts:

  • A Docker volume plugin to mount NVIDIA driver files inside containers at runtime.
  • A small wrapper around Docker to ease the deployment of our images.

More information on this here

While it has been working great so far, now that Docker 1.12 is coming out with a configurable runtime and complete OCI support, we would like to move away from this approach (which is admittedly hacky) and work on something that is better integrated with Docker.

The way I see it, we would provide a prestart OCI hook that triggers our implementation and configures the cgroups/namespaces correctly.

However, there are several things we need to solve first, specifically:

  1. How to detect if a given image needs GPU support
    Currently, we are using a special label com.nvidia.volumes.needed, but it is not exported as an OCI annotation (see #21324)
  2. How to pass down to the hook which GPU should be isolated
    Currently, we are using an environment variable NV_GPU
  3. How to check whether the image is compatible with the current driver or not.
    Currently, we are using a special label XXX_VERSION

All of the above could be solved using environment variables, but I'm not particularly fond of this idea (e.g. docker run -e NVIDIA_GPU=0,1 nvidia/cuda).

So is there a way to pass runtime/hook parameters from the docker command line, and if not, would it be worth adding one (e.g. --runtime-opt)?
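
For illustration, a hypothetical invocation with such a flag might look like the line below; note that --runtime-opt and the nvidia.gpu key are placeholders, not existing Docker options:

docker run --runtime-opt nvidia.gpu=0,1 nvidia/cuda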

flx42 commented

+@cpuguy83 we briefly discussed that with you during DockerCon 16.

3XX0 commented

Worth noting that opencontainers/runtime-spec#483 might affect this

@3XX0 I guess another option would be to have a patched version of runc that has different options and knows about GPUs?

The image detection is similar to an issue we have for multiarch in general; there are flags on Hub, but they are not well exposed yet. This might also work for the driver version; let me find the spec.

3XX0 commented

Yes, I thought about that too, but I would rather not have to track upstream runc, and we would still hit the same problems since the runtime config in Docker is daemon-wide and static.

Thanks, it would be greatly appreciated.

flx42 commented

I just saw that #24750 was closed and redirected here.
I believe we could already have basic GPU support with Docker Swarm if we were able to add devices when using docker service create. Is it on the roadmap?

Related: NVIDIA/nvidia-docker#141

@3XX0 @flx42 I am the original author of #24750. I am not sure whether this question is appropriate to post here; if not, please forgive me. If I just want Docker Swarm to orchestrate a cluster with GPU support, without caring whether it uses native Docker or nvidia-docker, could you give some comments or suggestions? Thanks in advance!

@flx42 I think you may be able to bind mount device nodes with docker service --mount, but it is not very well documented yet as I think the CLI is still being finalised; I am fairly sure the API allows bind mounts, though.

--mount type=bind,source=/host/path,target=/container/path

flx42 commented

@justincormack @cpuguy83 Yes, in NVIDIA/nvidia-docker#141 I figured out I can mount the user-level driver files like this:

$ docker service create --mount type=volume,source=nvidia_driver_367.35,target=/usr/local/nvidia,volume-driver=nvidia-docker [...]

But, unless I'm missing something, you can't bind mount a device, it seems to be like a mknod but without the proper device cgroup whitelisting.

$ docker service create --mount type=bind,source=/dev/nvidiactl,target=/dev/nvidiactl ubuntu:14.04 sh -c 'echo foo > /dev/nvidiactl'
$ docker logs stupefied_kilby.1.2445ld28x6ooo0rjns26ezsfg
sh: 1: cannot create /dev/nvidiactl: Operation not permitted

It's probably similar to doing something like this:

$ docker run -ti ubuntu:14.04
root@76d4bb08b07c:/# mknod -m 666 /dev/nvidiactl c 195 255
root@76d4bb08b07c:/# echo foo > /dev/nvidiactl
bash: /dev/nvidiactl: Operation not permitted

Whereas the following works (well, invalid arg is normal):

$ docker run -ti --device /dev/nvidiactl ubuntu:14.04
root@ea53a1b96226:/# echo foo > /dev/nvidiactl
bash: echo: write error: Invalid argument

flx42 commented

@NanXiao regarding your question, please look at NVIDIA/nvidia-docker#141

@flx42 ah yes, that would be an issue. Can you create a separate issue about not being able to add a device to a service, so we can track that specific problem?

flx42 commented

@justincormack created #24865, thank you!

A few comments:

How to detect if a given image needs GPU support

The way we handled this for multi-arch was by explicitly introducing the "arch" field into the image. I would suggest that we introduce an "accelerator" field to address not only GPUs but also, in the future, FPGAs and other accelerators.
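
For reference, the label-based detection used today can be checked by inspecting an image's labels; a rough sketch, assuming the image carries the com.nvidia.volumes.needed label mentioned in the first comment:

$ docker inspect --format '{{ index .Config.Labels "com.nvidia.volumes.needed" }}' nvidia/cuda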

On the compatibility check, I would make it optional. A lot of applications can run with or without GPUs: if GPUs are present they will take advantage of them, and if not they will just run in CPU-only mode. Making the driver check optional will make it easy to accommodate this requirement.

Any update here? Really looking forward to a standard way to use accelerators in containers :)

icy commented

See also kubernetes/kubernetes#19049. k8s is going to release a new version with GPU support.

Swarm is very good for our system (k8s has things we don't need). However, GPU support is definitely a key feature, and if Swarm doesn't have a clear plan for it we will have to go with k8s :D

Hey guys, I would really love to use a GPU-enabled Swarm. This issue is still open, so I guess it's not clear whether this will be on the roadmap or not?! Any news on this topic?

thx for the update @justincormack

@3XX0 @flx42

What are the low-level steps to give a container access to the GPUs? I was looking through your codebase and saw that there are devices that must be added, as well as volumes, but I was not sure what the volumes are used for. Are they for the libs?

Other than placing the devices inside, are there any other settings that need to be applied?

3XX0 commented

It's actually tricky; we explain most of it here, but there are a lot of corner cases to think about.
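
Roughly, the minimum today is the NVIDIA device nodes plus the user-level driver libraries exposed through the volume plugin; a simplified sketch (reusing the driver version and volume name from the earlier docker service example, so the exact values are illustrative) would be:

$ docker run -ti --device /dev/nvidiactl --device /dev/nvidia-uvm --device /dev/nvidia0 \
    --volume-driver=nvidia-docker -v nvidia_driver_367.35:/usr/local/nvidia:ro \
    nvidia/cuda nvidia-smi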

We've been working on a library to abstract all these things; the idea is to integrate it as a runc prestart hook. We have most of it working now and will publish it soon. The only issue with this approach is that we have to fork runc and rely on Docker's --runtime option.

In the long term, we should probably think about a native GPU integration leveraging something like --exec-opt.

@3XX0 if you want, we can work together on the runc changes. I think it makes sense to have some support for GPUs at that level, and it will make implementations much cleaner at the higher levels.

3XX0 commented

It should be pretty straightforward; the only things that need to be done are:

  1. Fix this opencontainers/runc#1044
  2. Append some hooks in the spec

Once 1) is fixed and we have an option to add custom hooks from within Docker (e.g. exec-opt), we won't need the fork anymore (except for backward compatibility).

@3XX0 why use hooks at all? Can you not populate everything in the spec itself to add the devices, add bind mounts, and give the correct permissions on the device level?

@3XX0 if we have a docker run --gpus what would the input data look like?

3XX0 commented

Our solution is runtime agnostic and hooks are perfect for that.
We also need to do more than what's exposed by the spec (e.g. update the library cache).

3XX0 commented

Right now we use an environment variable with a list of comma-separated IDs and/or UUIDs (similar to nvidia-docker's NV_GPU). It allows us to stay backward compatible with all Docker versions and to encode the GPUs required by a given image (e.g. ENV NVIDIA_GPU=any or ENV NVIDIA_GPU=all), and we can override it on the command line:

docker run --runtime=nvidia -e NVIDIA_GPU=0,1 nvidia/cuda

3XX0 commented

@crosbymichael We also hit some limitations with Docker; for example, a lot of GPU images need a large shm size and high memlock limits (our drivers need those). Not sure how to address that at the image level (Docker is not even relying on the OCI spec for /dev/shm).
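
As a concrete example, GPU workloads today often end up needing something along these lines (the values are illustrative):

docker run --runtime=nvidia --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 nvidia/cuda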

The workaround is to configure everything at the daemon level once #29492 is fixed, but it's far from ideal.

3XX0 commented

@justincormack @crosbymichael Do you have a timeline on the containerd 1.0 release and integration in Docker?

Right now the only option we have is to integrate at the runc level given that the containerd "runtime" is hardcoded. I would rather do it with containerd 1.0 if Docker were to support it.

@3XX0 containerd 1.0 is now integrated into Docker.

@3XX0 thanks for your excellent work!
I have some questions about nvidia-docker.
For example, assume a PC has two NVIDIA GPUs, GPU A and GPU B, where the index of GPU A is 0 and the index of GPU B is 1. When I execute the following command:

NV_GPU='0' nvidia-docker run -d nginx

can this container use GPU B? From your comments above, it should not.
Since my PC has only one NVIDIA GPU, I can't try this myself to confirm it.
However, I have tried running the following two commands:

NV_GPU='0' nvidia-docker run -d nginx

docker run -d nginx

and I did not notice any difference between the two containers.
In both containers I can get the information of my GPU.
Did I miss something, or does it just work like this? Looking forward to your reply, thanks!

flx42 commented

@WanLinghao Inside the docker run -d nginx container, you should not see the GPUs, unless you have a special configuration. Can you double-check?

@flx42 I have tried three kinds of commands:
1. docker run -d nginx
2. docker run --privileged=true -d nginx
3. NV_GPU='0' nvidia-docker run -d nginx

Then I opened a shell in each container and executed
find / -name '*nvidia*'
I got the following results:

First and third containers:

/sys/bus/pci/drivers/nvidia
/sys/kernel/slab/nvidia_pte_cache
/sys/kernel/slab/nvidia_p2p_page_cache
/sys/kernel/slab/nvidia_stack_cache
/sys/module/drm/holders/nvidia_drm
/sys/module/drm_kms_helper/holders/nvidia_drm
/sys/module/nvidia_modeset
/sys/module/nvidia_modeset/holders/nvidia_drm
/sys/module/nvidia
/sys/module/nvidia/drivers/pci:nvidia
/sys/module/nvidia/holders/nvidia_modeset
/sys/module/nvidia/holders/nvidia_uvm
/sys/module/nvidia_drm
/sys/module/nvidia_uvm
/proc/irq/33/nvidia
/proc/driver/nvidia
/proc/driver/nvidia-uvm

Second container:

/dev/nvidiactl
/dev/nvidia0
/dev/nvidia-uvm-tools
/dev/nvidia-uvm
/dev/nvidia-modeset
/sys/bus/pci/drivers/nvidia
/sys/kernel/slab/nvidia_pte_cache
/sys/kernel/slab/nvidia_p2p_page_cache
/sys/kernel/slab/nvidia_stack_cache
/sys/module/drm/holders/nvidia_drm
/sys/module/drm_kms_helper/holders/nvidia_drm
/sys/module/nvidia_modeset
/sys/module/nvidia_modeset/holders/nvidia_drm
/sys/module/nvidia
/sys/module/nvidia/drivers/pci:nvidia
/sys/module/nvidia/holders/nvidia_modeset
/sys/module/nvidia/holders/nvidia_uvm
/sys/module/nvidia_drm
/sys/module/nvidia_uvm
/proc/irq/33/nvidia
/proc/driver/nvidia
/proc/driver/nvidia-uvm

As you can see, there is no difference between commands 1 and 3.
Can you give more information about this?
Looking forward to your reply, thanks!

flx42 commented

Ah yes, nvidia-docker (version 1.0) acts as a passthrough to docker for this nginx image. We enable GPU support only for images that are based on our nvidia/cuda images from Docker Hub; we detect whether the image has a special label for this purpose.
Note that version 2.0 of nvidia-docker behaves differently (documented in our README).

So, try again with nvidia/cuda as the image.
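
For example, assuming nvidia-docker 1.0 and the nvidia/cuda image from Docker Hub, something like the following should print the GPU information inside the container:

$ nvidia-docker run --rm nvidia/cuda nvidia-smi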

Any news on this? Would ❤️ a LinuxKit distro with nvidia-docker onboard 🙌✨
//cc @3XX0 @justincormack

flx42 commented

Trying to revive interest in supporting OCI hooks here: #36987
If there is interest, we can close this issue once it's implemented.

Docker 19.03 has docker run --gpus. Closing this issue.
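
For reference, a minimal example with the new flag (the image tag is illustrative):

$ docker run --gpus all nvidia/cuda:10.0-base nvidia-smi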