Atinoda/text-generation-webui-docker

Error response from daemon: could not select device driver "nvidia" with capabilities:

dewijones92 opened this issue · 5 comments

No luck for me when trying to use this. Am I missing something? Thanks.

(base) dewi@DewiJones:~/code/text-generation-webui-docker/text-generation-webui-docker$ gs
++ pwd
+ current_dir=/home/dewi/code/text-generation-webui-docker/text-generation-webui-docker
+ [[ /home/dewi/code/text-generation-webui-docker/text-generation-webui-docker == \/\m\n\t\/\c* ]]
+ /usr/bin/git status -v -v
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   docker-compose.yml

--------------------------------------------------
Changes not staged for commit:
diff --git i/docker-compose.yml w/docker-compose.yml
index d1caff0..33dda5e 100644
--- i/docker-compose.yml
+++ w/docker-compose.yml
@@ -1,7 +1,7 @@
 version: "3"
 services:
   text-generation-webui-docker:
-    image: atinoda/text-generation-webui:default # Specify variant as the :tag
+    image: atinoda/text-generation-webui:llama-cpu # Specify variant as the :tag
     container_name: text-generation-webui
     environment:
       - EXTRA_LAUNCH_ARGS="--listen --verbose" # Custom launch args (e.g., --model MODEL_NAME)
no changes added to commit (use "git add" and/or "git commit -a")
git status
commit d4b58daffec5096e2a7057388420e74987537766 (HEAD -> master, origin/master, origin/HEAD)
Author: Atinoda <61033436+Atinoda@users.noreply.github.com>
Date:   Wed Oct 18 15:49:48 2023 +0100

    Separate nightly builds
(base) dewi@DewiJones:~/code/text-generation-webui-docker/text-generation-webui-docker$ docker compose up
Attaching to text-generation-webui
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
(base) dewi@DewiJones:~/code/text-generation-webui-docker/text-generation-webui-docker$

Are you able to run other Docker images that require CUDA? The error message suggests that Docker cannot access the GPU hardware.

I just noticed that you are trying to run the llama-cpu variant, please see #9 and #16 for relevant information. I will leave this open as a reminder for me to update the documentation with expanded instructions for CPU inference.

TL;DR: Comment out the deploy: block in docker-compose.yml. It requests an NVIDIA GPU, which the llama-cpu variant does not need.
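
For anyone who prefers to script that edit, here is a minimal sketch. It assumes the deploy: block follows the usual Compose GPU reservation layout (consistent with the "nvidia" driver and [[gpu]] capability named in the error message), is indented four spaces, and ends at its capabilities: line. The helper name comment_out_deploy is hypothetical and not part of the repository, so check the result against your own file.

```shell
#!/bin/sh
# Hypothetical helper: comment out the deploy: block of a compose file.
# Assumed shape of the block (may differ in your copy of docker-compose.yml):
#     deploy:
#       resources:
#         reservations:
#           devices:
#             - driver: nvidia
#               capabilities: [gpu]
comment_out_deploy() {
    file="$1"
    # Prefix every line from "    deploy:" through the first following
    # "capabilities:" line with "#"; all other lines pass through unchanged.
    sed '/^    deploy:/,/capabilities:/ s/^/#/' "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

Run it as `comment_out_deploy docker-compose.yml`, then start the container again with `docker compose up`.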

Hi,
I am getting the same error when I try to run the Docker container with the command below:

'docker run --gpus all image-id'

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

I created the VM using the default Amazon AMI, which is verified by Amazon.

These are the AMI details:

GPU (Kernel 4.14)
AMI name: amzn2-ami-ecs-gpu-hvm-2.0.20231103-x86_64-ebs
ECS Agent version: 1.79.0
Docker version: 20.10.25
Containerd version: 1.6.19
NVIDIA driver version: 535.54.03
CUDA version: 12.2.0
Source AMI name: amzn2-ami-minimal-hvm-2.0.20230926.0-x86_64-ebs

I used the commands below, taken from the AWS documentation, to remove the old NVIDIA driver (535.54.03) and install the new one (535.129.03):

sudo yum remove nvidia*
sudo yum remove cuda*
sudo yum erase nvidia cuda
sudo yum update -y
sudo amazon-linux-extras install kernel-5.15
sudo yum install gcc make && sudo yum update -y
sudo reboot
sudo yum install -y gcc kernel-devel-$(uname -r)
chmod +x NVIDIA-Linux-x86_64*.run
sudo CC=/usr/bin/gcc10-cc ./NVIDIA-Linux-x86_64*.run
sudo touch /etc/modprobe.d/nvidia.conf
echo "options nvidia NVreg_EnableGpuFirmware=0" | sudo tee --append /etc/modprobe.d/nvidia.conf
sudo reboot

After running the commands above, I was able to upgrade the NVIDIA driver to 535.129.03 and the kernel to 5.15.
However, I still get the error above when I run the Docker container.

Any suggestions?

@shaiksuhel1999 You need to install nvidia-ctk, plus nvidia-container-runtime if the first package doesn't come with it. Then add the following to your Docker daemon.json (typically /etc/docker/daemon.json):

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
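
To confirm the entry above is actually in place, a quick check like the following may help. It is only a sketch: has_nvidia_runtime is a hypothetical helper, the default path /etc/docker/daemon.json is an assumption about a standard Linux install, and the match is a plain text search, not real JSON parsing.

```shell
#!/bin/sh
# Hypothetical helper: report whether a Docker daemon config registers
# an "nvidia" runtime that points at nvidia-container-runtime.
# Usage: has_nvidia_runtime [path]   (default path is an assumption)
has_nvidia_runtime() {
    config="${1:-/etc/docker/daemon.json}"
    # No config file means no custom runtimes are registered.
    [ -f "$config" ] || return 1
    # Crude text match; a JSON-aware tool such as jq would be more robust.
    grep -q '"nvidia"' "$config" &&
        grep -q 'nvidia-container-runtime' "$config"
}
```

For example, `has_nvidia_runtime && echo "nvidia runtime configured"`. Docker only reads daemon.json at startup, so restart the daemon after editing it (on systemd hosts, `sudo systemctl restart docker`). Recent NVIDIA Container Toolkit releases can also write this entry for you with `sudo nvidia-ctk runtime configure --runtime=docker`.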

Closing this issue because the docker-compose.yml now has a comment indicating that the deploy: section should be commented out for non-NVIDIA inference.