8 Nvidia GPUs: No GPU to monitor.
MathieuMoalic commented
We have a workstation running NixOS with eight Nvidia GPUs. The same NixOS build runs without issue on another workstation (different hardware, but also with Nvidia GPUs).
nvtop version 3.1.0
ldd $(which nvidia-smi)
linux-vdso.so.1 (0x00007ffcb89ca000)
libpthread.so.0 => /nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib/libpthread.so.0 (0x00007f6901d06000)
libm.so.6 => /nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib/libm.so.6 (0x00007f6901c23000)
libdl.so.2 => /nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib/libdl.so.2 (0x00007f6901c1e000)
libc.so.6 => /nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib/libc.so.6 (0x00007f6901a2f000)
librt.so.1 => /nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib/librt.so.1 (0x00007f6901a2a000)
/nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib/ld-linux-x86-64.so.2 => /nix/store/ddwyrxif62r8n6xclvskjyy6szdhvj60-glibc-2.39-5/lib64/ld-linux-x86-64.so.2 (0x00007f6901d0d000)
Following this other issue, I checked that I have libnvidia-ml.so installed:
sudo find /nix/store -type f -name 'libnvidia-ml.so'
/nix/store/fg41wbwiiygf6vkinsqz1y8dniif48y3-cudatoolkit-12.2.2/lib/stubs/libnvidia-ml.so
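A file existing in the store does not by itself mean the dynamic loader can open it at runtime, so a second check may be useful here. The following is a minimal sketch, not something taken from nvtop: it asks the loader (via Python's `ctypes`) whether an NVML library can be opened from the default search path. The candidate names in `CANDIDATES` and the helper `first_loadable` are assumptions made for illustration.

```python
import ctypes

# Library names to try, in order. These are an assumption about what an
# NVML client would ask the loader for; they are not taken from nvtop.
CANDIDATES = ["libnvidia-ml.so.1", "libnvidia-ml.so"]


def first_loadable(names):
    """Return the first name the dynamic loader can open, or None."""
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the library cannot be loaded
        except OSError:
            continue
        return name
    return None


if __name__ == "__main__":
    found = first_loadable(CANDIDATES)
    if found is None:
        print("NVML is not loadable from the default search path")
    else:
        print(f"loaded {found}")
```

If this reports that nothing is loadable while `nvidia-smi` works, the difference is likely in how each program locates the library, which would be worth including in the report.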
nvidia-smi
Fri Apr 26 08:48:12 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67 Driver Version: 550.67 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4080 Off | 00000000:01:00.0 Off | N/A |
| 0% 24C P8 2W / 320W | 2MiB / 16376MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA GeForce RTX 4080 Off | 00000000:02:00.0 Off | N/A |
| 0% 28C P8 2W / 320W | 2MiB / 16376MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA GeForce RTX 4080 Off | 00000000:03:00.0 Off | N/A |
| 0% 29C P8 2W / 320W | 2MiB / 16376MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA GeForce RTX 4080 Off | 00000000:04:00.0 Off | N/A |
| 0% 29C P8 2W / 320W | 2MiB / 16376MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 4 NVIDIA GeForce RTX 4080 ... Off | 00000000:06:00.0 Off | N/A |
| 0% 30C P8 7W / 320W | 2MiB / 16376MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 5 NVIDIA GeForce RTX 4080 Off | 00000000:07:00.0 Off | N/A |
| 0% 29C P8 6W / 320W | 2MiB / 16376MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 6 NVIDIA GeForce RTX 4080 Off | 00000000:08:00.0 Off | N/A |
| 0% 28C P8 3W / 320W | 2MiB / 16376MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 7 NVIDIA GeForce RTX 4080 Off | 00000000:09:00.0 Off | N/A |
| 0% 27C P8 2W / 320W | 2MiB / 16376MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
❯ nvtop
No GPU to monitor.
MathieuMoalic commented
Installing a desktop environment fixed the issue. Some dependencies were probably missing, and the error message did not make that clear.