ROCm/HIP

[Issue]: hipErrorNoDevice error when printing `torch.cuda.is_available()`

Closed this issue · 3 comments

Problem Description

Hi.
I am using ROCm 6.2 on WSL Ubuntu 24.04. After installing the ROCm build of PyTorch following the instructions, I tried printing `torch.cuda.is_available()`, but it returned False.

When I set AMD_LOG_LEVEL=7 to print additional debug information, I received the following messages:

$ AMD_LOG_LEVEL=7 python3 -c 'import torch; print(torch.cuda.is_available())'
:3:rocdevice.cpp            :468 : 0518006503 us: [pid:858   tid:0x7f19cb128080] Initializing HSA stack.
:4:runtime.cpp              :85  : 0518006987 us: [pid:858   tid:0x7f19cb128080] init
:3:hip_context.cpp          :49  : 0518006999 us: [pid:858   tid:0x7f19cb128080] Direct Dispatch: 1
:3:hip_device_runtime.cpp   :651 : 0518007009 us: [pid:858   tid:0x7f19cb128080]  hipGetDeviceCount ( 0x7fff0ec5b3c8 )
:3:hip_device_runtime.cpp   :653 : 0518007026 us: [pid:858   tid:0x7f19cb128080] hipGetDeviceCount: Returned hipErrorNoDevice :
:3:hip_error.cpp            :36  : 0518007039 us: [pid:858   tid:0x7f19cb128080]  hipGetLastError (  )
:3:hip_error.cpp            :36  : 0518007049 us: [pid:858   tid:0x7f19cb128080] hipGetLastError: Returned hipErrorNoDevice :
False
:3:hip_device_runtime.cpp   :620 : 0518095352 us: [pid:858   tid:0x7f19cb128080]  hipDeviceSynchronize (  )
:3:hip_device_runtime.cpp   :620 : 0518095390 us: [pid:858   tid:0x7f19cb128080] hipDeviceSynchronize: Returned hipErrorNoDevice :
:1:hip_platform.cpp         :182 : 0518095395 us: [pid:858   tid:0x7f19cb128080] Error during hipDeviceSynchronize, error: 100
:3:hip_device_runtime.cpp   :620 : 0518097321 us: [pid:858   tid:0x7f19cb128080]  hipDeviceSynchronize (  )
:3:hip_device_runtime.cpp   :620 : 0518097343 us: [pid:858   tid:0x7f19cb128080] hipDeviceSynchronize: Returned hipErrorNoDevice :
:1:hip_platform.cpp         :182 : 0518097346 us: [pid:858   tid:0x7f19cb128080] Error during hipDeviceSynchronize, error: 100

... (The rest is repeating "Error during hipDeviceSynchronize, error: 100" etc.)

I tried adding my current user to the render and video groups and rebooted WSL, but the situation is the same (my username is sam):

$ getent group video
video:x:44:sam

$ getent group render
video:x:44:sam
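
The group checks above can also be scripted. This is a minimal sketch (the `user_in_group` helper is mine, not part of any ROCm tooling) that parses `getent group` output of the form `name:passwd:gid:member1,member2`:

```python
def user_in_group(getent_line: str, user: str) -> bool:
    """Return True if `user` appears in the member list of a
    `getent group` line such as 'video:x:44:sam'."""
    fields = getent_line.strip().split(":")
    if len(fields) != 4:
        raise ValueError(f"unexpected getent format: {getent_line!r}")
    members = fields[3].split(",") if fields[3] else []
    return user in members

# Example against the output pasted above:
print(user_in_group("video:x:44:sam", "sam"))  # True
```

Note that membership only takes effect in new login sessions, which is why a WSL restart (`wsl --shutdown`) is part of the check.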

Does anyone have any thoughts on this? :)

Operating System

Ubuntu 24.04.1 LTS (Noble Numbat) (WSL2)

 wsl -l -v
  NAME            STATE           VERSION
* Ubuntu-24.04    Running         2

CPU

13th Gen Intel(R) Core(TM) i5-13600K

GPU

AMD Radeon RX 7900 GRE

ROCm Version

ROCm 6.2.3

ROCm Component

ROCm

Steps to Reproduce

  1. Install the amdgpu installer package: sudo apt install ./amdgpu-install_6.2.60203-1_all.deb
  2. Install ROCm: amdgpu-install -y --usecase=wsl,rocm --no-dkms
  3. Install PyTorch: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2
  4. Check installation: AMD_LOG_LEVEL=7 python3 -c 'import torch; print(torch.cuda.is_available())'
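
As a sanity check on step 3, it helps to confirm that the installed wheel is actually a ROCm build rather than a stock CUDA/CPU build. A minimal sketch (the helper name is mine, not part of pip or PyTorch) that inspects a version string for the ROCm local-version tag:

```python
def is_rocm_build(torch_version: str) -> bool:
    """Heuristic: ROCm wheels carry a '+rocmX.Y' local-version
    suffix, e.g. '2.3.0+rocm6.2.3'; CUDA/CPU builds do not."""
    return "+rocm" in torch_version

# With a real install, pass torch.__version__ here.
print(is_rocm_build("2.3.0+rocm6.2.3"))  # True
print(is_rocm_build("2.5.1+cu121"))      # False
```

A `False` here means the index URL in step 3 was ignored or overridden and the wrong wheel was pulled.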

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

WSL environment detected.
=====================
HSA System Attributes
=====================
Runtime Version:         1.1
Runtime Ext Version:     1.6
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE
System Endianness:       LITTLE
Mwaitx:                  DISABLED
DMAbuf Support:          NO

==========
HSA Agents
==========
*******
Agent 1
*******
  Name:                    CPU
  Uuid:                    CPU-XX
  Marketing Name:          CPU
  Vendor Name:             CPU
  Feature:                 None specified
  Profile:                 FULL_PROFILE
  Float Round Mode:        NEAR
  Max Queue Number:        0(0x0)
  Queue Min Size:          0(0x0)
  Queue Max Size:          0(0x0)
  Queue Type:              MULTI
  Node:                    0
  Device Type:             CPU
  Cache Info:
  Chip ID:                 0(0x0)
  Cacheline Size:          64(0x40)
  Internal Node ID:        0
  Compute Unit:            20
  SIMDs per CU:            0
  Shader Engines:          0
  Shader Arrs. per Eng.:   0
  Memory Properties:
  Features:                None
  Pool Info:
    Pool 1
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    32758112(0x1f3d960) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
    Pool 2
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED
      Size:                    32758112(0x1f3d960) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
  ISA Info:
*******
Agent 2
*******
  Name:                    gfx1100
  Marketing Name:          AMD Radeon RX 7900 GRE
  Vendor Name:             AMD
  Feature:                 KERNEL_DISPATCH
  Profile:                 BASE_PROFILE
  Float Round Mode:        NEAR
  Max Queue Number:        16(0x10)
  Queue Min Size:          4096(0x1000)
  Queue Max Size:          131072(0x20000)
  Queue Type:              MULTI
  Node:                    1
  Device Type:             GPU
  Cache Info:
    L1:                      32(0x20) KB
    L2:                      6144(0x1800) KB
    L3:                      65536(0x10000) KB
  Chip ID:                 29772(0x744c)
  Cacheline Size:          64(0x40)
  Max Clock Freq. (MHz):   2052
  Internal Node ID:        1
  Compute Unit:            80
  SIMDs per CU:            2
  Shader Engines:          6
  Shader Arrs. per Eng.:   2
  Coherent Host Access:    FALSE
  Memory Properties:
  Features:                KERNEL_DISPATCH
  Fast F16 Operation:      TRUE
  Wavefront Size:          32(0x20)
  Workgroup Max Size:      1024(0x400)
  Workgroup Max Size per Dimension:
    x                        1024(0x400)
    y                        1024(0x400)
    z                        1024(0x400)
  Max Waves Per CU:        32(0x20)
  Max Work-item Per CU:    1024(0x400)
  Grid Max Size:           4294967295(0xffffffff)
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)
    y                        4294967295(0xffffffff)
    z                        4294967295(0xffffffff)
  Max fbarriers/Workgrp:   32
  Packet Processor uCode:: 2280
  SDMA engine uCode::      21
  IOMMU Support::          None
  Pool Info:
    Pool 1
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED
      Size:                    16691368(0xfeb0a8) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:2048KB
      Alloc Alignment:         4KB
      Accessible by all:       FALSE
    Pool 2
      Segment:                 GROUP
      Size:                    64(0x40) KB
      Allocatable:             FALSE
      Alloc Granule:           0KB
      Alloc Recommended Granule:0KB
      Alloc Alignment:         0KB
      Accessible by all:       FALSE
  ISA Info:
    ISA 1
      Name:                    amdgcn-amd-amdhsa--gfx1100
      Machine Models:          HSA_MACHINE_MODEL_LARGE
      Profiles:                HSA_PROFILE_BASE
      Default Rounding Mode:   NEAR
      Default Rounding Mode:   NEAR
      Fast f16:                TRUE
      Workgroup Max Size:      1024(0x400)
      Workgroup Max Size per Dimension:
        x                        1024(0x400)
        y                        1024(0x400)
        z                        1024(0x400)
      Grid Max Size:           4294967295(0xffffffff)
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)
        y                        4294967295(0xffffffff)
        z                        4294967295(0xffffffff)
      FBarrier Max Size:       32
*** Done ***

Additional Information

No response

cjatin commented

It seems you are running ROCm through WSL; can you share your Windows driver version as well?

> It seems you are running ROCm through WSL; can you share your Windows driver version as well?

Certainly.
The current Windows driver version is 24.12.1.

Hi.
I think I missed this part of the documentation.

Previously I installed PyTorch with the command below. That pulls the stock upstream wheels, which do not work on WSL:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2

The correct way to install:

# Download the WSL-compatible ROCm wheels from repo.radeon.com
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.2.3/torch-2.3.0%2Brocm6.2.3-cp310-cp310-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.2.3/torchvision-0.18.0%2Brocm6.2.3-cp310-cp310-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-6.2.3/pytorch_triton_rocm-2.3.0%2Brocm6.2.3.5a02332983-cp310-cp310-linux_x86_64.whl

# Replace the upstream packages with the ROCm builds
pip3 uninstall torch torchvision pytorch-triton-rocm
pip3 install torch-2.3.0+rocm6.2.3-cp310-cp310-linux_x86_64.whl torchvision-0.18.0+rocm6.2.3-cp310-cp310-linux_x86_64.whl pytorch_triton_rocm-2.3.0+rocm6.2.3.5a02332983-cp310-cp310-linux_x86_64.whl

# Swap the bundled HSA runtime for the WSL-aware one shipped with ROCm
location=$(pip show torch | grep Location | awk -F ": " '{print $2}')
cd "${location}/torch/lib/"
rm libhsa-runtime64.so*
cp /opt/rocm/lib/libhsa-runtime64.so.1.2 libhsa-runtime64.so
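
After the library swap, one quick way to confirm it took effect is to list the matching runtime files in torch's lib directory. A small standard-library sketch (the helper is mine, not a ROCm tool):

```python
from pathlib import Path

def hsa_runtime_files(lib_dir: str) -> list:
    """Return sorted names of libhsa-runtime64.so* files in lib_dir.
    After the steps above, exactly one file, 'libhsa-runtime64.so'
    (the copy of the ROCm WSL runtime), should remain."""
    return sorted(p.name for p in Path(lib_dir).glob("libhsa-runtime64.so*"))
```

With a real install you would pass `<site-packages>/torch/lib`; any result other than `['libhsa-runtime64.so']` suggests the swap did not take, and `torch.cuda.is_available()` will likely still return False.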