elFarto/nvidia-vaapi-driver

No hardware decoding - Nvidia 545.29.02 / Wayland

Closed this issue · 12 comments

I haven't had working video decode for a few driver versions now and I haven't been able to figure out why.
I'm using EndeavourOS (Arch) with the latest Firefox and standard ffmpeg package.

nvidia-smi:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.02              Driver Version: 545.29.02    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA TITAN Xp                On  | 00000000:08:00.0  On |                  N/A |
|  0%   21C    P8              17W / 300W |    478MiB / 12288MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

NVD_LOG=1 vainfo

Trying display: wayland
      5112.386306102 [9056-9056] ../nvidia-vaapi-driver-0.0.11/src/vabackend.c:2140       __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 40
      5112.386336403 [9056-9056] ../nvidia-vaapi-driver-0.0.11/src/vabackend.c:2149       __vaDriverInit_1_0 Now have 0 (0 max) instances
      5112.386341353 [9056-9056] ../nvidia-vaapi-driver-0.0.11/src/vabackend.c:2175       __vaDriverInit_1_0 Selecting Direct backend
      5112.390638852 [9056-9056] ../nvidia-vaapi-driver-0.0.11/src/direct/direct-export-buf.c:  85      direct_initExporter Found NVIDIA GPU 0 at /dev/dri/renderD128
      5112.390652712 [9056-9056] ../nvidia-vaapi-driver-0.0.11/src/direct/nv-driver.c: 246            init_nvdriver Initing nvdriver...
      5112.390671752 [9056-9056] ../nvidia-vaapi-driver-0.0.11/src/direct/nv-driver.c: 264            init_nvdriver NVIDIA kernel driver version: 545.29.02, major version: 545, minor version: 29
      5112.390678192 [9056-9056] ../nvidia-vaapi-driver-0.0.11/src/direct/nv-driver.c: 271            init_nvdriver Got dev info: 800 1 0 fe
vainfo: VA-API version: 1.20 (libva 2.20.0)
vainfo: Driver version: VA-API NVDEC driver [direct backend]
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileHEVCMain               :	VAEntrypointVLD
      VAProfileVP9Profile0            :	VAEntrypointVLD
      VAProfileHEVCMain10             :	VAEntrypointVLD
      VAProfileHEVCMain12             :	VAEntrypointVLD
      VAProfileVP9Profile2            :	VAEntrypointVLD
      5112.453942730 [9056-9056] ../nvidia-vaapi-driver-0.0.11/src/vabackend.c:2050              nvTerminate Terminating 0x562075cfa790
      5112.454029520 [9056-9056] ../nvidia-vaapi-driver-0.0.11/src/vabackend.c:2064              nvTerminate Now have 0 (0 max) instances

about:config:

media.ffmpeg.vaapi.enabled=true
media.rdd-ffmpeg.enabled=true
media.av1.enabled=false
gfx.x11-egl.force-enabled=true
widget.dmabuf.force-enabled=true

Additionally, the following environment variables are set in ~/.config/environment.d/moz.conf:

MOZ_ENABLE_WAYLAND=1
MOZ_DISABLE_RDD_SANDBOX=1
MOZ_DRM_DEVICE=⁄dev⁄dri⁄renderD128
EGL_PLATFORM=wayland
NVD_GPU=0

Though setting them in /etc/environment doesn't seem to make a difference.

/etc/environment also contains:

LIBVA_DRIVER_NAME=nvidia
NVD_BACKEND=direct

But setting these in the environment.d conf file also makes no difference.
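Since the variables are split across two files, it can help to confirm which ones actually reached the current session. The following is a minimal sketch, not part of the driver or Firefox; `report_vars` is a hypothetical helper name, and it only sees exported variables.

```shell
# Print each named variable as NAME=value, or note that it is not set.
# report_vars is a hypothetical helper; printenv only reports exported variables.
report_vars() {
    for name in "$@"; do
        if value=$(printenv "$name"); then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set\n' "$name"
        fi
    done
}

# Variable names taken from the two files described above.
report_vars LIBVA_DRIVER_NAME NVD_BACKEND MOZ_DISABLE_RDD_SANDBOX MOZ_DRM_DEVICE
```

Running this in a terminal started from the graphical session shows what applications launched from that session would inherit.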

Running NVD_LOG=1 firefox outputs nothing to the console - it's immediately ready for the next command.
I feel like I'm missing something simple but I'm not sure what. Thanks in advance to anyone who tries to help me figure this out.

Same here. It hasn't worked for a few driver versions, including 545.

Is your ffmpeg build compiled with VA-API support? You can check with ffmpeg -hwaccels; the output should include vaapi.
Also, just adding the variables to the environment files doesn't do anything until you log out and back in again.
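The ffmpeg check can be scripted roughly as follows. This is a sketch that assumes `ffmpeg -hwaccels` prints one method name per line; `has_vaapi` is a hypothetical helper name.

```shell
# Succeeds if the hwaccel list on stdin has a line consisting only of "vaapi".
# has_vaapi is a hypothetical helper; one-name-per-line output is assumed.
has_vaapi() {
    grep -Eq '^[[:space:]]*vaapi[[:space:]]*$'
}

# Usage on a system with ffmpeg on PATH:
#   ffmpeg -hide_banner -hwaccels | has_vaapi && echo "vaapi is listed"
```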

@rkoot Thanks for the suggestions, I can confirm vaapi is listed under hardware acceleration methods and reboots did occur between switching where the environment variables were set.

@Dirleye Pending a better solution I found a workaround. I could not make it work with wayland but it did work in X11 with direct backend.
MOZ_DISABLE_RDD_SANDBOX=1 LIBVA_DRIVER_NAME=nvidia NVD_BACKEND=direct firefox

> @Dirleye Pending a better solution I found a workaround. I could not make it work with wayland but it did work in X11 with direct backend. MOZ_DISABLE_RDD_SANDBOX=1 LIBVA_DRIVER_NAME=nvidia NVD_BACKEND=direct firefox

Works fine for me in Wayland with same prerequisites

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.02              Driver Version: 545.29.02    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+

But with a little difference in the launch string:
NVD_BACKEND=direct MOZ_ENABLE_WAYLAND=1 EGL_PLATFORM=wayland MOZ_DISABLE_RDD_SANDBOX=1 LIBVA_DRIVER_NAME=nvidia firefox

MOZ_ENABLE_WAYLAND will be unnecessary from Firefox 121 onward.
(I'm not sure we still need to disable RDD sandboxing in newer versions of Firefox, because that seems to be fixed.)

> NVD_BACKEND=direct MOZ_ENABLE_WAYLAND=1 EGL_PLATFORM=wayland MOZ_DISABLE_RDD_SANDBOX=1 LIBVA_DRIVER_NAME=nvidia firefox

Copying and pasting this to make sure everything is set up correctly still doesn't work for me. @m00r3ik what GPU are you using?

If it's working in Wayland on this driver for others, there's probably something set somewhere that's interfering somehow unless it's a generational thing with the GPUs.

> @m00r3ik what GPU are you using?

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.02              Driver Version: 545.29.02    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2060        Off | 00000000:09:00.0  On |                  N/A |
| 32%   34C    P8              13W / 160W |    899MiB /  6144MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

I also have these set (using Gentoo):

VDPAU_DRIVER=nvidia
#GST_VAAPI_ALL_DRIVERS=1
LIBVA_DRIVERS_PATH=/usr/lib64/dri
LIBGL_DRIVERS_PATH=/usr/lib64/dri
GBM_BACKENDS_PATH=/usr/lib64/gbm
 $ file /usr/lib64/gbm/nvidia-drm_gbm.so 
/usr/lib64/gbm/nvidia-drm_gbm.so: symbolic link to ../libnvidia-allocator.so.1

My symlink is a bit more direct but they lead to the same file so there shouldn't be a difference there.
```
file /usr/lib64/gbm/nvidia-drm_gbm.so
/usr/lib64/gbm/nvidia-drm_gbm.so: symbolic link to ../libnvidia-allocator.so.545.29.02
```
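To check on a single machine that two links really end at the same file, both can be resolved fully and compared. This is a minimal sketch; `same_target` is a hypothetical helper, and `readlink -f` is assumed to be the GNU coreutils version.

```shell
# Succeeds if both paths resolve (through any chain of symlinks) to the same file.
# same_target is a hypothetical helper; relies on GNU readlink -f.
same_target() {
    a=$(readlink -f -- "$1") || return 1
    b=$(readlink -f -- "$2") || return 1
    [ "$a" = "$b" ]
}

# Usage with the paths mentioned above:
#   same_target /usr/lib64/gbm/nvidia-drm_gbm.so /usr/lib64/libnvidia-allocator.so.1 \
#       && echo "same file"
```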

Adding those environment variables didn't help either. Perhaps it's a Turing vs Pascal thing? Thank you all for the suggestions so far.

> Adding those environment variables didn't help either. Perhaps it's a Turing vs Pascal thing? Thank you all for the suggestions so far.

Maybe, but I don't think so, because everything works on a 980 Ti too.
One last question:
Perhaps you are using a flatpak/snap-installed version of Firefox?
In this case, this driver may not work.

> Perhaps you are using a flatpak/snap-installed version of Firefox? In this case, this driver may not work

I'm using the version from the standard repos with Pacman so no issues there. Once I get some time I'll make a fresh install on another SSD and do nothing but set this up to rule out something external to this being the problem.

Can you post a log of Firefox playing back a video with NVD_LOG=1 set? (make sure you don't have any other Firefox instances running when you start it, or it'll just reuse them and ignore the NVD_LOG environment variable).
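The "no other instances running" condition can be guarded with a small wrapper. This is only a sketch: `start_logged_firefox` is a hypothetical name, and it assumes both the command and the process are called `firefox`, which may differ between distributions.

```shell
# Start Firefox with NVD_LOG=1, but only when no other instance is running.
# Assumption: the process name and the command name are both "firefox".
start_logged_firefox() {
    if pgrep -x firefox >/dev/null; then
        echo "Firefox is already running; close it first so NVD_LOG takes effect." >&2
        return 1
    fi
    NVD_LOG=1 firefox "$@"
}

# Usage:
#   start_logged_firefox
```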

Sorry, I missed that you've already done that and didn't see anything. Can you try running Firefox with MOZ_LOG="PlatformDecoderModule:5,Dmabuf:5" set instead?
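A log captured this way can be long, so filtering it for Dmabuf failures may help. The sketch below assumes the MOZ_LOG output goes to stderr and was redirected into a file; `dmabuf_failures` is a hypothetical helper name.

```shell
# Print the Dmabuf lines in a captured log that mention a failure.
# dmabuf_failures is a hypothetical helper; the log file path is the first argument.
dmabuf_failures() {
    grep 'Dmabuf' "$1" | grep -i 'fail'
}

# Usage, assuming MOZ_LOG writes to stderr:
#   MOZ_LOG="PlatformDecoderModule:5,Dmabuf:5" firefox 2> log.txt
#   dmabuf_failures log.txt
```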

Here is the log: log.txt

The line "D/Dmabuf Failed to open drm render node ⁄dev⁄dri⁄renderD128 error No such file or directory" is odd, since the device definitely exists.

I'll try again without MOZ_DRM_DEVICE=⁄dev⁄dri⁄renderD128 set.

Edit: Removing MOZ_DRM_DEVICE=⁄dev⁄dri⁄renderD128 from my environment variables allows Firefox to correctly set the device to /dev/dri/renderD128 (???) and now both H264 and VP9 are working again via NVDEC.

I have absolutely no idea why that environment variable broke it but please do let me know if you find out.
Thank you everyone for your suggestions once more.
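One possible explanation, offered only as a guess: as the text appears in this thread, the MOZ_DRM_DEVICE value is written with the character ⁄ (U+2044 FRACTION SLASH), while the path Firefox picks on its own uses an ASCII /. That difference may just be a copy artifact, but if the same character was in moz.conf, the value would not name any real file, which would be consistent with the "No such file or directory" message. A minimal sketch for checking a value for such characters follows; `has_non_ascii` is a hypothetical helper name.

```shell
# Succeeds if the argument contains any byte outside printable ASCII.
# has_non_ascii is a hypothetical helper; LC_ALL=C makes grep work byte by byte.
has_non_ascii() {
    printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}

# Check the variable as it exists in the current session (empty if unset).
if has_non_ascii "${MOZ_DRM_DEVICE:-}"; then
    echo "MOZ_DRM_DEVICE contains non-ASCII characters"
fi
```

A path that looks correct on screen but fails this check was probably pasted from a source that substituted look-alike characters.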