k4yt3x/video2x

Is the docker image really built with nvidia 565 drivers?

Closed this issue · 12 comments

I can't get the docker image to run anything nvidia because it complains:

Failed to initialize NVML: Driver/library version mismatch
NVML library version: 565.57

when I run things like nvidia-smi on a Linux Mint host, where the NVIDIA driver is 535. I even updated using the official NVIDIA PPA, but the highest version I can get there is 560. I'm going to try to downgrade the container's drivers to 550 so I can stick with a recommended driver and see what happens.
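A note on reading that error: it generally means the user-space NVML library that got loaded does not match the NVIDIA kernel module running on the host. Below is a minimal sketch of that comparison, assuming the two version strings have already been collected. compare_driver_versions is a hypothetical helper, not part of any NVIDIA tool, and 550.120 is the host driver version that shows up later in this thread.

```shell
# compare_driver_versions HOST_VERSION LIBRARY_VERSION
# Hypothetical helper: prints "match" when both versions agree on their
# first two dot-separated fields, otherwise "mismatch". Only two fields are
# compared because the NVML error prints a short form such as "565.57".
compare_driver_versions() {
    local host lib
    host=$(printf '%s' "$1" | cut -d. -f1,2)
    lib=$(printf '%s' "$2" | cut -d. -f1,2)
    if [ "$host" = "$lib" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

# Host driver 550.120 against the 565.57 library named in the error.
compare_driver_versions "550.120" "565.57"
```

On a real host the first argument could come from `nvidia-smi --query-gpu=driver_version --format=csv,noheader` or from `cat /sys/module/nvidia/version`.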

I think it will work if the docker container's driver version is higher than that installed on your host. Have you installed the nvidia-container-toolkit or nvidia-docker2? For your error, have you rebooted after installing the new NVIDIA driver?

Yeah, I've done both of those. I just don't think Mint's old drivers are going to allow the newer ones to run. I tried with old and new nvidia/cuda tags: the old ones work, the new ones don't. I'm in no way a Linux / Docker / NVIDIA expert, so I'm not a definitive answer on this. The only thing I know to try would be downgrading the video2x container's NVIDIA drivers, but I know NOTHING about Arch Linux, and my quick scan told me Arch Linux has ONLY the newest driver in its repos. I didn't see any way to add a legacy repo or anything. I'm probably bailing on it for now...

redacted@machine:~/test$ docker run --gpus all -it --rm --entrypoint nvidia-smi ghcr.io/k4yt3x/video2x:6.1.1
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 565.57
redacted@machine:~/test$ nvidia-smi
Sun Nov 10 12:31:59 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120                Driver Version: 550.120        CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    Off |   00000000:01:00.0  On |                  N/A |
|  0%   31C    P8             11W /  285W |     610MiB /  16376MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      3417      G   /usr/lib/xorg/Xorg                            237MiB |
|    0   N/A  N/A      4667      G   cinnamon                                       77MiB |
+-----------------------------------------------------------------------------------------+
redacted@machine:~/test$ docker run --gpus all -it --rm --entrypoint nvidia-smi ubuntu
Sun Nov 10 19:32:26 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120                Driver Version: 550.120        CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    Off |   00000000:01:00.0  On |                  N/A |
|  0%   31C    P8             11W /  285W |     615MiB /  16376MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
redacted@machine:~/test$ sudo docker run --rm --gpus all nvidia/cuda:12.6.2-base-ubuntu24.04 nvidia-smi
Unable to find image 'nvidia/cuda:12.6.2-base-ubuntu24.04' locally
12.6.2-base-ubuntu24.04: Pulling from nvidia/cuda
d1fbec07a2e5: Pull complete 
55ec83d4f55f: Pull complete 
446ca34efb63: Pull complete 
92cf3b14fc15: Pull complete 
3ee1be183cf2: Pull complete 
Digest: sha256:631ec7090c36ab846cf021073ff4a64fb9cffa90b4f9f0083799288c607073ce
Status: Downloaded newer image for nvidia/cuda:12.6.2-base-ubuntu24.04
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.6, please update your driver to a newer version, or use an earlier cuda container: unknown.
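The last error can be read as a plain version comparison: the image asks for cuda>=12.6, while the host's nvidia-smi reports CUDA Version 12.4, which is the highest CUDA version that driver supports. Here is a minimal sketch of that check, assuming GNU `sort -V` is available. version_ge and check_cuda_image are hypothetical helper names, and the real nvidia-container-cli check may accept additional conditions that this ignores.

```shell
# version_ge A B: succeeds when version A is greater than or equal to B.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_cuda_image HOST_CUDA REQUIRED_CUDA
# Prints "ok" when the host's supported CUDA version satisfies the image's
# requirement, otherwise "driver too old".
check_cuda_image() {
    if version_ge "$1" "$2"; then
        echo "ok"
    else
        echo "driver too old"
    fi
}

# Host reports CUDA 12.4; the 12.6.2 image requires cuda>=12.6.
check_cuda_image "12.4" "12.6"
```

By the same logic, an image tag at or below the host's reported CUDA version should pass this particular check, which lines up with the observation that older nvidia/cuda tags work.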

Disregarding what nvidia-smi says, what does video2x --listgpus output?

Only CPU

redacted@machine:~/test$ docker run --gpus all -it --rm -v .:/host ghcr.io/k4yt3x/video2x:6.1.1 --listgpu
0. llvmpipe (LLVM 18.1.8, 256 bits)
	Type: CPU
	Vulkan API Version: 1.3.289
	Driver Version: 0.0.1
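For scripting this check, the device list can be filtered for anything that is not a CPU (llvmpipe) device. This is a rough sketch that assumes the output keeps the "Type:" line format shown above. count_hardware_gpus is a hypothetical helper, not a video2x feature.

```shell
# count_hardware_gpus: reads `video2x --listgpus`-style text on stdin and
# prints how many listed devices have a Type other than "CPU".
count_hardware_gpus() {
    awk '/^[[:space:]]*Type:/ && $2 != "CPU" { n++ } END { print n + 0 }'
}

# Sample input copied from the container output above: only llvmpipe is listed.
printf '0. llvmpipe (LLVM 18.1.8, 256 bits)\n\tType: CPU\n' | count_hardware_gpus
```

A result of 0 means only software rendering is visible to Vulkan. On a host install the input could come from `video2x --listgpus | count_hardware_gpus`.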

I see... looks like the device isn't recognized at all. Well, we've got other tickets open about Docker compatibility issues, and another issue about packaging w/ Flatpak, which may solve this kind of issue. For now, since you're using Mint, which is based on Ubuntu I believe, have you tried the Ubuntu deb packages?

I have, and the libavcodec it needs is 61, while the best I could find quickly late at night was 60, and I had to go downstream for it. So I suspected it was the first of many dependency problems I'd have. This was my first install of Mint because Ubuntu (Canonical) was pissing me off by moving everything to snaps, and half of the snaps didn't work. I was also getting a lot of stability problems. This is the first "Mint's too old" problem I've had, and I've been on it for most of this year.

I've actually made a release for Ubuntu 22.04. Have you tried that yet?

Yeah, (video2x-linux-ubuntu2204-amd64.deb) it installs but I get probs running it. Maybe I'll poke at it after a cup of coffee...

redacted@machine:~/Downloads$ video2x --listgpus
video2x: error while loading shared libraries: libavcodec.so.61: cannot open shared object file: No such file or directory

Sure, lmk how it goes. If there's something I can add or tweak that'll make it compatible with Mint as well I'd be happy to do it.

OK, the video2x deb seems to want libavcodec 61. The max version I see in the Mint repos is 60. I'm not sure why, since Mint 22 is based on Ubuntu 24.04, so the 22.04 deb should be fine. Again, I'm not an expert (only an enthusiast), but are you expecting a higher version of ffmpeg?

redacted@machine:~/Downloads$ video2x --listgpus
video2x: error while loading shared libraries: libavcodec.so.61: cannot open shared object file: No such file or directory
redacted@machine:~/Downloads$ neofetch
             ...-:::::-...                 redacted@machine 
          .-MMMMMMMMMMMMMMM-.              ---------------- 
      .-MMMM`..-:::::::-..`MMMM-.          OS: Linux Mint 22 x86_64 
    .:MMMM.:MMMMMMMMMMMMMMM:.MMMM:.        Kernel: 6.8.0-48-generic 
   -MMM-M---MMMMMMMMMMMMMMMMMMM.MMM-       Uptime: 1 hour, 23 mins 
 `:MMM:MM`  :MMMM:....::-...-MMMM:MMM:`    Packages: 2781 (dpkg), 40 (flatpak) 
 :MMM:MMM`  :MM:`  ``    ``  `:MMM:MMM:    Shell: bash 5.2.21 
.MMM.MMMM`  :MM.  -MM.  .MM-  `MMMM.MMM.   Resolution: 3440x1440, 1920x1080 
:MMM:MMMM`  :MM.  -MM-  .MM:  `MMMM-MMM:   DE: Cinnamon 6.2.9 
:MMM:MMMM`  :MM.  -MM-  .MM:  `MMMM:MMM:   WM: Mutter (Muffin) 
:MMM:MMMM`  :MM.  -MM-  .MM:  `MMMM-MMM:   WM Theme: Mint-Y-Dark (Mint-Y) 
.MMM.MMMM`  :MM:--:MM:--:MM:  `MMMM.MMM.   Theme: Mint-Y [GTK2/3] 
 :MMM:MMM-  `-MMMMMMMMMMMM-`  -MMM-MMM:    Icons: Mint-Y [GTK2/3] 
  :MMM:MMM:`                `:MMM:MMM:     Terminal: gnome-terminal 
   .MMM.MMMM:--------------:MMMM.MMM.      CPU: Intel i7-6700K (8) @ 4.200GHz 
     '-MMMM.-MMMMMMMMMMMMMMM-.MMMM-'       GPU: NVIDIA GeForce RTX 4070 Ti SUPER 
       '.-MMMM``--:::::--``MMMM-.'         Memory: 4398MiB / 32037MiB 
            '-MMMMMMMMMMMMM-'
               ``-:::::-``                                         
redacted@machine:~/Downloads$ apt-cache depends video2x
video2x
  Depends: ffmpeg
    ffmpeg:i386
  Depends: libvulkan1
  Depends: libboost-program-options1.74.0
redacted@machine:~/Downloads$ apt-cache depends ffmpeg
ffmpeg
  Depends: libavcodec60
    libavcodec-extra60
  Depends: libavdevice60
  Depends: libavfilter9
    libavfilter-extra9
  Depends: libavformat60
    libavformat-extra60
  Depends: libavutil58
  Depends: libc6
  Depends: libpostproc57
  Depends: libsdl2-2.0-0
  Depends: libswresample4
  Depends: libswscale7
  Suggests: ffmpeg-doc
redacted@machine:~/Downloads$ apt-cache depends libvulkan1
libvulkan1
  Depends: libc6
  Breaks: libvulkan-dev
  Breaks: <vulkan-loader>
 |Recommends: mesa-vulkan-drivers
  Recommends: <vulkan-icd>
    mesa-vulkan-drivers
  Replaces: libvulkan-dev
  Replaces: <vulkan-loader>
redacted@machine:~/Downloads$ apt-cache depends libboost-program-options1.74.0
libboost-program-options1.74.0
  Depends: libc6
  Depends: libgcc-s1
  Depends: libstdc++6
redacted@machine:~/Downloads$ dpkg-query --list ffmpeg
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version          Architecture Description
+++-==============-================-============-================================================================
ii  ffmpeg         7:6.1.1-3ubuntu5 amd64        Tools for transcoding, streaming and playing of multimedia files
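One way to list every unresolved shared library at once, rather than hitting them one error at a time, is to filter the output of `ldd`. A minimal sketch, assuming the usual "name => not found" line format that ldd prints. missing_libs is a hypothetical helper name, and the sample input is illustrative rather than captured from this machine.

```shell
# missing_libs: reads `ldd` output on stdin and prints the name of each
# library that the dynamic loader could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Illustrative sample input: one unresolved and one resolved library.
printf '\tlibavcodec.so.61 => not found\n\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6\n' | missing_libs
```

Against the installed binary this would be `ldd "$(command -v video2x)" | missing_libs`, and `ldconfig -p | grep libavcodec` shows which libavcodec versions the system actually provides.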

Ah, sorry, that's something I forgot to mention. The version of FFmpeg on Ubuntu 22.04 is too old, and it's not compiled with libplacebo. I had to use a PPA and compile against FFmpeg 7 for Colab:

sudo add-apt-repository -y ppa:ubuntuhandbook1/ffmpeg7
sudo apt-get update
sudo apt-get install ffmpeg

Then Video2X should install and run with RealESRGAN. I still couldn't get libplacebo to work for Colab though.
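To see whether a given FFmpeg build was compiled with libplacebo, its build configuration can be searched for the corresponding configure flag. This is a rough sketch, assuming the flags arrive on stdin one per line (as `ffmpeg -hide_banner -buildconf` prints them). has_libplacebo is a hypothetical helper and the sample flags are illustrative.

```shell
# has_libplacebo: reads FFmpeg configure flags on stdin and prints "yes"
# when --enable-libplacebo is among them, otherwise "no".
has_libplacebo() {
    if grep -q -e '--enable-libplacebo'; then
        echo "yes"
    else
        echo "no"
    fi
}

# Illustrative sample input, not taken from a real build.
printf '%s\n' '--enable-gpl' '--enable-libplacebo' | has_libplacebo
```

On an installed system this could be run as `ffmpeg -hide_banner -buildconf | has_libplacebo`.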

Success! I added the UbuntuHandbook ppa for ffmpeg,

sudo add-apt-repository ppa:ubuntuhandbook1/ffmpeg7

and upgraded ffmpeg (which I noticed brought libavcodec61 along with it) and

redacted@machine:~$ video2x --listgpus
0. NVIDIA GeForce RTX 4070 Ti SUPER
	Type: Discrete GPU
	Vulkan API Version: 1.3.277
	Driver Version: 550.480.0
1. llvmpipe (LLVM 17.0.6, 256 bits)
	Type: CPU
	Vulkan API Version: 1.3.274
	Driver Version: 0.0.1

This is with the deb installation. I'll give it a whirl and see if it makes it through an upscale. I'll also watch for anything that breaks with the updated libs and let you know. Thanks for checking in and chatting.