NVlabs/FoundationPose

Support for RTX 4090

shingarey opened this issue · 11 comments

The current Dockerfile is based on the cudagl:11.3.0-devel image, whose CUDA version is too old to support the RTX 4090. To fix this, update the base image and the torch version in the Dockerfile. The following changes worked for me.

  1. Update the Docker base image and the torch version in docker/dockerfile:
FROM nvidia/cuda:12.1.0-devel-ubuntu20.04

...

RUN conda init bash &&\
    echo "conda activate my" >> ~/.bashrc &&\
    conda activate my &&\
    pip install torchvision==0.16.0+cu121 torchaudio==2.1.0 torch==2.1.0+cu121 --index-url https://download.pytorch.org/whl/cu121 &&\
...
  2. Additionally, update the compiler flags in bundlesdf/mycuda/setup.py from c++14 to c++17:
nvcc_flags = ['-Xcompiler', '-O3', '-std=c++17', '-U__CUDA_NO_HALF_OPERATORS__', '-U__CUDA_NO_HALF_CONVERSIONS__', '-U__CUDA_NO_HALF2_OPERATORS__']
c_flags = ['-O3', '-std=c++17']
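
If you would rather apply the flag change from the command line, here is a minimal sketch. It assumes the flags in setup.py are spelled exactly -std=c++14; bump_cxx_std is a hypothetical helper name and is not part of the repository.

```shell
# Hypothetical helper (not part of FoundationPose): switch -std=c++14 to
# -std=c++17 in the given file, keeping a backup next to it as <file>.bak.
# Assumes the flag is spelled exactly "-std=c++14" in the file.
bump_cxx_std() {
  # In a basic regular expression "+" is a literal character,
  # so "c++14" matches as written.
  sed -i.bak 's/-std=c++14/-std=c++17/g' "$1"
}

# Usage, from the repository root:
#   bump_cxx_std bundlesdf/mycuda/setup.py
```

The backup file makes it easy to diff the result against the original before rebuilding.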

With these fixes, I was able to successfully build and run the provided examples: run_demo.py and bundlesdf/run_nerf.py.
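
For context on why the base image matters: the RTX 4090 is an Ada Lovelace GPU (compute capability 8.9), and to my knowledge CUDA toolkits older than 11.8 cannot target it. The sketch below checks a CUDA version string against that threshold. The function name cuda_supports_ada is hypothetical, and the check relies on sort -V being available (GNU coreutils).

```shell
# Hypothetical helper: succeeds (exit status 0) if the given CUDA toolkit
# version is at least 11.8, the first release I know of that supports
# compute capability 8.9 (Ada Lovelace, e.g. RTX 4090).
cuda_supports_ada() {
  # sort -V orders version strings; if 11.8 sorts first (or ties),
  # the given version is >= 11.8.
  lowest=$(printf '%s\n%s\n' "11.8" "$1" | sort -V | head -n 1)
  [ "$lowest" = "11.8" ]
}

# Usage:
#   cuda_supports_ada "11.3.0" || echo "CUDA 11.3.0 is too old for the RTX 4090"
#   cuda_supports_ada "12.1.0" && echo "CUDA 12.1.0 is new enough"
```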

Perfect, thanks for sharing this!

@shingarey could you share the Docker environment for the 4090? For some reason, I could not build the image from the Dockerfile. It would be great to be able to get the environment with docker pull.

@shingarey I pulled your Docker image, but could you please share how you incorporate it into run_container.sh?

(base) mona@ada:/data/FoundationPose$ docker pull shingarey/foundationpose_custom_cuda121:latest
latest: Pulling from shingarey/foundationpose_custom_cuda121
96d54c3075c9: Pull complete 
755e535b54a3: Pull complete 
24ff69e0a1e4: Pull complete 
76a627ca5e65: Pull complete 
35817692a87e: Pull complete 
84b6a42e847a: Pull complete 
4639b7cd68e5: Pull complete 
a7bc10701a5b: Pull complete 
7d32a9230f8f: Pull complete 
8c35a1861813: Pull complete 
e3f63d7242f5: Pull complete 
7d4b097a7e9d: Pull complete 
4612d657f431: Pull complete 
730e1cd9723b: Pull complete 
b4a75e4ece39: Pull complete 
fe0293ccd76c: Pull complete 
ed1a8048ef98: Pull complete 
1308be50bc97: Pull complete 
4412cb55cd12: Pull complete 
013df3150cc3: Pull complete 
8441b9a1ff80: Pull complete 
ad30346aeadc: Pull complete 
9c24ef1ac2af: Pull complete 
Digest: sha256:288252092889a52a2e3f1c0087e4a380a601beedf54615c18312ab59cf1f3fb5
Status: Downloaded newer image for shingarey/foundationpose_custom_cuda121:latest
docker.io/shingarey/foundationpose_custom_cuda121:latest

(base) mona@ada:/data/FoundationPose/docker$ cat run_container.sh
docker rm -f foundationpose
CATGRASP_DIR=$(pwd)/../
xhost +  && docker run --gpus all --env NVIDIA_DISABLE_REQUIRE=1 -it --network=host --name foundationpose shingarey/foundationpose_custom_cuda121:latest  --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v /data:/data -v /mnt:/mnt -v /tmp/.X11-unix:/tmp/.X11-unix -v /tmp:/tmp  --ipc=host -e DISPLAY=${DISPLAY} -e GIT_INDEX_FILE foundationpose:latest bash

(base) mona@ada:/data/FoundationPose/docker$ bash run_container.sh 
fp
access control disabled, clients can connect from any host

==========
== CUDA ==
==========

CUDA Version 12.1.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

*************************
** DEPRECATION NOTICE! **
*************************
THIS IMAGE IS DEPRECATED and is scheduled for DELETION.
    https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/support-policy.md

/opt/nvidia/nvidia_entrypoint.sh: line 67: exec: --: invalid option
exec: usage: exec [-cl] [-a name] [command [arguments ...]] [redirection ...]

(base) mona@ada:/data/FoundationPose/docker$ bash run_container.sh
foundationpose
access control disabled, clients can connect from any host

==========
== CUDA ==
==========

CUDA Version 12.1.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

*************************
** DEPRECATION NOTICE! **
*************************
THIS IMAGE IS DEPRECATED and is scheduled for DELETION.
    https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/support-policy.md

/opt/nvidia/nvidia_entrypoint.sh: line 67: exec: --: invalid option
exec: usage: exec [-cl] [-a name] [command [arguments ...]] [redirection ...]
(base) mona@ada:/data/FoundationPose/docker$ cat run_container.sh
docker rm -f foundationpose
CATGRASP_DIR=$(pwd)/../
xhost +  && docker run --gpus all --env NVIDIA_DISABLE_REQUIRE=1 -it --network=host --name foundationpose docker.io/shingarey/foundationpose_custom_cuda121:latest  --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v /data:/data -v /mnt:/mnt -v /tmp/.X11-unix:/tmp/.X11-unix -v /tmp:/tmp  --ipc=host -e DISPLAY=${DISPLAY} -e GIT_INDEX_FILE foundationpose:latest bash

docker rm -f foundationpose
CATGRASP_DIR=$(pwd)/../
xhost +  && docker run --gpus all --env NVIDIA_DISABLE_REQUIRE=1 -it --network=host --name foundationpose  --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v __PATH_TO_FOUNDATIONPOSE_ON_HOST_PC__:__PATH_ON_CONTAINER__ -v /mnt:/mnt -v /tmp/.X11-unix:/tmp/.X11-unix -v /tmp:/tmp  --ipc=host -e DISPLAY=${DISPLAY} -e GIT_INDEX_FILE foundationpose:latest bash

__PATH_TO_FOUNDATIONPOSE_ON_HOST_PC__ is the path to the directory on your host system.
__PATH_ON_CONTAINER__ is where to mount it inside the container, e.g. /home/your_name.

foundationpose:latest may need to be changed if you use a different image.
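
A note on why the earlier script failed, as far as I can tell: the syntax is docker run [OPTIONS] IMAGE [COMMAND] [ARG...], so everything after the image name is passed to the container as its command. In the script above, the image name comes before --cap-add=SYS_PTRACE, so the entrypoint tries to exec that option, which matches the "exec: --: invalid option" error. Here is a minimal sketch that keeps all options before the image. run_container and the DRY_RUN variable are hypothetical names, and xhost + is left out for brevity.

```shell
# Hypothetical wrapper: builds the docker run command with every option
# placed before the image name. By default (DRY_RUN=1) it only prints the
# arguments, one per line; set DRY_RUN=0 to actually start the container.
# Note: plain POSIX sh, so the variables below are global.
run_container() {
  image="$1"          # e.g. shingarey/foundationpose_custom_cuda121:latest
  host_dir="$2"       # FoundationPose directory on the host
  container_dir="$3"  # mount point inside the container

  # Options first, then the image, then the command to run inside it.
  set -- docker run --gpus all --env NVIDIA_DISABLE_REQUIRE=1 -it \
    --network=host --name foundationpose \
    --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
    -v "${host_dir}:${container_dir}" -v /mnt:/mnt \
    -v /tmp/.X11-unix:/tmp/.X11-unix -v /tmp:/tmp \
    --ipc=host -e "DISPLAY=${DISPLAY:-}" -e GIT_INDEX_FILE \
    "$image" bash

  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$@"
  else
    "$@"
  fi
}

# Usage:
#   docker rm -f foundationpose
#   DRY_RUN=0 run_container shingarey/foundationpose_custom_cuda121:latest "$(pwd)/.." /home/your_name
```

Printing the arguments first makes it easy to confirm that nothing except the command (bash) follows the image name.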

Does __PATH_TO_FOUNDATIONPOSE_ON_HOST_PC__ refer to the path of FoundationPose-main? I don't quite understand. Thank you.

@shingarey

Can you show all the steps, including the docker pull?
Do you use the same command as below?
docker pull shingarey/foundationpose_custom_cuda121:latest

Yes, it refers to the path to the FoundationPose directory on your host PC.
I use it as described in the initial post.

Thank you for your reply, it is very helpful to me.

What are the reasons for not making a PR so that this update is available in the main branch? Why keep it at 11.3? Also, the cudagl and cuda images are in different Docker repos. cudagl adds support for OpenGL, but doesn't have images past CUDA 11.4. Do we need OpenGL support?

Thank you very much for the discussion. In the end, I resolved the issue using the following commands:

docker rm -f foundationpose
DIR=$(pwd)/../
xhost + && docker run --gpus all --env NVIDIA_DISABLE_REQUIRE=1 -it --network=host --name foundationpose --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v $DIR:$DIR -v /mnt:/mnt -v /tmp/.X11-unix:/tmp/.X11-unix -v /tmp:/tmp --ipc=host -e DISPLAY=${DISPLAY} -e GIT_INDEX_FILE shingarey/foundationpose_custom_cuda121:latest bash -c "cd $DIR && bash"