Error: couldn't find any HIP devices
ztz0223 opened this issue · 1 comment
ztz0223 commented
Hi all,
I have two W6800 GPUs in a machine running RHEL 8, and I tested gpu-burn from:
https://github.com/ROCm-Developer-Tools/HIP-Examples/tree/master/gpu-burn
When I run the gpuburn-hip binary on the physical machine, it works:
[xxx@dock2 build]$ ./gpuburn-hip -t 200
Total no. of GPUs found: 2
Init Burn Thread for device (0)
Init Burn Thread for device (1)
Burn Thread using device (0)
Burn Thread using device (1)
Temps: [GPU0: 32 C] [GPU1: 34 C] 200s
Temps: [GPU0: 37 C] [GPU1: 35 C] 199s
Temps: [GPU0: 37 C] [GPU1: 36 C] 198s
Temps: [GPU0: 37 C] [GPU1: 36 C] 197s
Temps: [GPU0: 38 C] [GPU1: 36 C] 196s
Temps: [GPU0: 38 C] [GPU1: 37 C] 195s
Temps: [GPU0: 39 C] [GPU1: 38 C] 194s
Temps: [GPU0: 39 C] [GPU1: 38 C] 193s
But when I test in the rocm/rocm-terminal container, it fails. rocm-smi reports:
rocm-user@f90db1115a3b:/gpu-burn/build$ rocm-smi
========================= ROCm System Management Interface =========================
=================================== Concise Info ===================================
GPU  Temp (DieEdge)  AvgPwr  SCLK  MCLK   Fan    Perf  PwrCap  VRAM%  GPU%
0    28.0c           9.0W    0Mhz  96Mhz  20.0%  auto  213.0W  0%     0%
====================================================================================
=============================== End of ROCm SMI Log ================================
rocm-user@f90db1115a3b:/gpu-burn/build$
Then I run the binary:
rocm-user@f90db1115a3b:/gpu-burn/build$ ./gpuburn-hip
Error: couldn't find any HIP devices
rocm-user@f90db1115a3b:/gpu-burn/build$
Any ideas? Is the W6800 not recognized by the container?
Thanks.
ztz0223 commented
I figured it out: the user account in the container must be added to the video and render groups.
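
For anyone hitting the same error, here is a minimal sketch of a check that can be run inside the container before launching gpuburn-hip. It assumes the usual ROCm device nodes (/dev/kfd and /dev/dri/renderD*); the helper names in_group and can_open are made up for this example and are not part of any ROCm tool.

```shell
#!/bin/sh
# Sketch: report whether the current user is likely to be able to open
# the GPU device nodes. Device paths are assumed to be the usual ROCm
# ones; helper names are hypothetical.

# in_group NAME: succeeds if the current user belongs to group NAME.
in_group() {
    id -nG | tr ' ' '\n' | grep -qx "$1"
}

# can_open PATH: succeeds if PATH exists and is readable and writable.
can_open() {
    [ -r "$1" ] && [ -w "$1" ]
}

for g in video render; do
    if in_group "$g"; then
        echo "group $g: member"
    else
        echo "group $g: NOT a member"
    fi
done

# If no render node exists, the glob stays literal and is reported
# as not accessible.
for dev in /dev/kfd /dev/dri/renderD*; do
    if can_open "$dev"; then
        echo "$dev: accessible"
    else
        echo "$dev: not accessible"
    fi
done
```

If a group is reported as missing, one option is to pass --group-add video --group-add render to docker run when starting the container. Group IDs inside the container may not match the host's, in which case passing the host's numeric GID may be needed instead.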