amd/xdna-driver

DRM_IOCTL_AMDXDNA_GET_INFO IOCTL failed (err=95): Operation not supported

Closed this issue · 6 comments

# xrt_test

====== 0: npu3 xrt vadd started =====
DRM_IOCTL_AMDXDNA_GET_INFO IOCTL failed (err=95): Operation not supported
====== 0: npu3 xrt vadd FAILED  =====

1       test(s) executed
1       test(s) FAILED!
# xbutil validate -d 0000:c7:00.1

XRT build version: 2.17.0
Build hash:
Build date: 2024-06-10 20:36:38
Git branch:
PID: 2214
UID: 0
[Thu Jun 13 10:05:29 2024 GMT]
HOST: m-kot
EXE: /usr/bin/unwrapped/xbutil2
[xbutil] ERROR: DRM_IOCTL_AMDXDNA_GET_INFO IOCTL failed (err=95): Operation not supported
# dmesg | grep -v input | grep -E "xdna|npu|xocl|xclmgmt|drm|gpu"
[    1.347323] xocl: loading out-of-tree module taints kernel.
[    1.356150] xclmgmt init()
[    2.554377] systemd[1]: Starting Load Kernel Module drm...
[    2.560314] systemd[1]: modprobe@drm.service: Deactivated successfully.
[    2.560358] systemd[1]: Finished Load Kernel Module drm.
[    2.801077] Loading firmware: amdnpu/1502_00/npu.sbin
[    2.802016] amdxdna 0000:c7:00.1: enabling device (0000 -> 0002)
[    2.809259] amdxdna 0000:c7:00.1: (Develop) IOMMU mode is 0
[    2.840447] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
[    2.860449] amdxdna 0000:c7:00.1: set npu_hclock = 1024 mhz
[    2.894357] [drm] amdgpu kernel modesetting enabled.
[    2.897619] amdgpu: Virtual CRAT table created for CPU
[    2.897626] amdgpu: Topology: Add CPU node
[    2.897713] amdgpu 0000:c6:00.0: enabling device (0006 -> 0007)
[    2.897734] [drm] initializing kernel modesetting (IP DISCOVERY 0x1002:0x1900 0x1002:0x0124 0xC5).
[    2.897741] [drm] register mmio base: 0xDC500000
[    2.897741] [drm] register mmio size: 524288
[    2.900727] [drm] add ip block number 0 <soc21_common>
[    2.900730] [drm] add ip block number 1 <gmc_v11_0>
[    2.900732] [drm] add ip block number 2 <ih_v6_0>
[    2.900734] [drm] add ip block number 3 <psp>
[    2.900736] [drm] add ip block number 4 <smu>
[    2.900737] [drm] add ip block number 5 <dm>
[    2.900739] [drm] add ip block number 6 <gfx_v11_0>
[    2.900740] [drm] add ip block number 7 <sdma_v6_0>
[    2.900742] [drm] add ip block number 8 <vcn_v4_0>
[    2.900744] [drm] add ip block number 9 <jpeg_v4_0>
[    2.900745] [drm] add ip block number 10 <mes_v11_0>
[    2.900761] amdgpu 0000:c6:00.0: amdgpu: Fetched VBIOS from VFCT
[    2.900764] amdgpu: ATOM BIOS: 113-PHXGENERIC-001
[    2.900777] Loading firmware: amdgpu/psp_13_0_4_toc.bin
[    2.901513] [drm] Initialized amdxdna_accel_driver 1.0.0 20240124 for 0000:c7:00.1 on minor 0
[    2.901597] Loading firmware: amdgpu/psp_13_0_4_ta.bin
[    2.902331] Loading firmware: amdgpu/dcn_3_1_4_dmcub.bin
[    2.903122] Loading firmware: amdgpu/gc_11_0_1_pfp.bin
[    2.903583] Loading firmware: amdgpu/gc_11_0_1_me.bin
[    2.904080] Loading firmware: amdgpu/gc_11_0_1_rlc.bin
[    2.904605] Loading firmware: amdgpu/gc_11_0_1_mec.bin
[    2.905238] Loading firmware: amdgpu/gc_11_0_1_imu.bin
[    2.905714] Loading firmware: amdgpu/sdma_6_0_1.bin
[    2.906029] [drm] VCN(0) encode/decode are enabled in VM mode
[    2.906031] Loading firmware: amdgpu/vcn_4_0_2.bin
[    2.906770] amdgpu 0000:c6:00.0: [drm:jpeg_v4_0_early_init [amdgpu]] JPEG decode is enabled in VM mode
[    2.906882] Loading firmware: amdgpu/gc_11_0_1_mes_2.bin
[    2.907442] Loading firmware: amdgpu/gc_11_0_1_mes1.bin
[    2.908065] amdgpu 0000:c6:00.0: vgaarb: deactivate vga console
[    2.908068] amdgpu 0000:c6:00.0: amdgpu: Trusted Memory Zone (TMZ) feature enabled
[    2.908092] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
[    2.908113] amdgpu 0000:c6:00.0: amdgpu: VRAM: 4096M 0x0000008000000000 - 0x00000080FFFFFFFF (4096M used)
[    2.908115] amdgpu 0000:c6:00.0: amdgpu: GART: 512M 0x00007FFF00000000 - 0x00007FFF1FFFFFFF
[    2.908134] [drm] Detected VRAM RAM=4096M, BAR=4096M
[    2.908135] [drm] RAM width 64bits DDR5
[    2.908208] [drm] amdgpu: 4096M of VRAM memory ready
[    2.908209] [drm] amdgpu: 13942M of GTT memory ready.
[    2.908219] [drm] GART: num cpu pages 131072, num gpu pages 131072
[    2.908448] [drm] PCIE GART of 512M enabled (table at 0x00000080FFD00000).
[    2.908724] [drm] Loading DMUB firmware via PSP: version=0x08003A00
[    2.909005] [drm] Found VCN firmware Version ENC: 1.19 DEC: 7 VEP: 0 Revision: 13
[    2.909008] amdgpu 0000:c6:00.0: amdgpu: Will use PSP to load VCN firmware
[    2.932862] [drm] reserve 0x4000000 from 0x80f8000000 for PSP TMR
[    3.475099] amdgpu 0000:c6:00.0: amdgpu: RAS: optional ras ta ucode is not available
[    3.482664] amdgpu 0000:c6:00.0: amdgpu: RAP: optional rap ta ucode is not available
[    3.482668] amdgpu 0000:c6:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[    3.514961] amdgpu 0000:c6:00.0: amdgpu: SMU is initialized successfully!
[    3.514965] [drm] Seamless boot condition check passed
[    3.516033] [drm] Display Core v3.2.266 initialized on DCN 3.1.4
[    3.516037] [drm] DP-HDMI FRL PCON supported
[    3.518631] [drm] DMUB hardware initialized: version=0x08003A00
[    3.520335] snd_hda_intel 0000:c6:00.1: bound 0000:c6:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[    3.588227] [drm] kiq ring mec 3 pipe 1 q 0
[    3.590651] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[    3.590677] amdgpu 0000:c6:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
[    3.659561] amdgpu: HMM registered 4096MB device memory
[    3.660114] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[    3.660126] kfd kfd: amdgpu: Total number of KFD nodes to be created: 1
[    3.660221] amdgpu: Virtual CRAT table created for GPU
[    3.660492] amdgpu: Topology: Add dGPU node [0x1900:0x1002]
[    3.660494] kfd kfd: amdgpu: added device 1002:1900
[    3.660503] amdgpu 0000:c6:00.0: amdgpu: SE 1, SH per SE 2, CU per SH 6, active_cu_number 12
[    3.660507] amdgpu 0000:c6:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[    3.660509] amdgpu 0000:c6:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[    3.660510] amdgpu 0000:c6:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[    3.660511] amdgpu 0000:c6:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[    3.660512] amdgpu 0000:c6:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[    3.660513] amdgpu 0000:c6:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[    3.660514] amdgpu 0000:c6:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[    3.660515] amdgpu 0000:c6:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[    3.660516] amdgpu 0000:c6:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[    3.660517] amdgpu 0000:c6:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[    3.660518] amdgpu 0000:c6:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[    3.660519] amdgpu 0000:c6:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[    3.660521] amdgpu 0000:c6:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[    3.661404] [drm] ring gfx_32768.1.1 was added
[    3.661896] [drm] ring compute_32768.2.2 was added
[    3.662375] [drm] ring sdma_32768.3.3 was added
[    3.662420] [drm] ring gfx_32768.1.1 ib test pass
[    3.662451] [drm] ring compute_32768.2.2 ib test pass
[    3.662501] [drm] ring sdma_32768.3.3 ib test pass
[    3.664571] [drm] Initialized amdgpu 3.57.0 20150101 for 0000:c6:00.0 on minor 0
[    3.667849] fbcon: amdgpudrmfb (fb0) is primary device
[    3.667975] [drm] DSC precompute is not needed.
[    4.097544] amdgpu 0000:c6:00.0: [drm] REG_WAIT timeout 1us * 100000 tries - optc314_disable_crtc line:148
[    4.152546] amdgpu 0000:c6:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[   53.990860] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
[   54.010771] amdxdna 0000:c7:00.1: set npu_hclock = 1024 mhz
[   54.052663] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[   59.637423] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[   61.574206] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  122.652191] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
[  122.672252] amdxdna 0000:c7:00.1: set npu_hclock = 1024 mhz
[  122.714141] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  293.601336] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
[  293.621373] amdxdna 0000:c7:00.1: set npu_hclock = 1024 mhz
[  293.663138] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  300.966818] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
[  300.987233] amdxdna 0000:c7:00.1: set npu_hclock = 1024 mhz
[  301.030168] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.751547] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
[  314.771241] amdxdna 0000:c7:00.1: set npu_hclock = 1024 mhz
[  314.813866] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.814563] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.814690] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.814809] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.814936] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.815040] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.815167] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.815267] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.815369] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.815464] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.815567] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.815655] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.815748] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.815837] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.815929] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816025] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816115] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816209] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816299] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816386] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816482] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816573] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816671] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816766] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816858] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.816951] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  314.817045] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  370.301712] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
[  370.321703] amdxdna 0000:c7:00.1: set npu_hclock = 1024 mhz
[  370.363371] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  891.645155] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
[  891.661209] amdxdna 0000:c7:00.1: set npu_hclock = 1024 mhz
[  891.702813] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  900.371245] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
[  900.391240] amdxdna 0000:c7:00.1: set npu_hclock = 1024 mhz
[  900.433023] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  906.238892] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
[  906.258981] amdxdna 0000:c7:00.1: set npu_hclock = 1024 mhz
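
For reference, err=95 on Linux is EOPNOTSUPP, which lines up with the `aie2_get_info: Not supported request parameter 7` messages in the log above. A minimal sketch for summarizing which request parameters the driver rejects and how often, assuming the dmesg output has been captured as text; the function name and the sample lines are illustrative only and are not part of the driver or XRT:

```python
import re
from collections import Counter

# Pattern for the amdxdna message seen in the dmesg output above.
PATTERN = re.compile(r"aie2_get_info: Not supported request parameter (\d+)")


def count_unsupported_params(dmesg_text: str) -> Counter:
    """Count how often each GET_INFO request parameter is rejected."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical input: three lines copied from the log above.
SAMPLE = """\
[   54.052663] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[   59.637423] amdxdna 0000:c7:00.1: aie2_get_info: Not supported request parameter 7
[  122.652191] amdxdna 0000:c7:00.1: set mpnpu_clock = 600 mhz
"""

if __name__ == "__main__":
    for param, hits in count_unsupported_params(SAMPLE).items():
        print(f"parameter {param}: rejected {hits} time(s)")
```

In this log only parameter 7 is ever rejected, which suggests that user space is issuing one query the loaded kernel module does not implement.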

NPU3 platform is not yet supported.

But the previous build works: "xbutil validate -d 0000:c7:00.1" validates correctly, and "xbutil examine" also runs correctly.

The code requesting DRM_IOCTL_AMDXDNA_GET_INFO doesn't seem to exist before commit 7682e0b, right?

Were you testing the latest driver in this repo? I thought we had moved to XRT version 2.18, but you're running XRT 2.17. How did you build and install XRT on your system?

The xdna-driver is the latest version. It is packaged with Portage, with some patches that change its installation path.
During the build, xdna-driver's libxrt_core and libxrt_coreutils libraries would conflict with the installed XRT libraries (version 2.17), so they were removed and XRT 2.17 (https://github.com/Xilinx/XRT/tree/202410.2.17.319) was used instead.

Please follow the instructions listed in the readme and build XRT from the submodule directory. It does not work with some random version of XRT.
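
A mismatch like this can be caught before running any tests by reading the "XRT build version" line that xbutil prints. The following is only a sketch: the expected version (2.18) is taken from the comment above rather than from the submodule itself, so treat it as an assumption, and the helper names are hypothetical.

```python
import re

# Assumption: 2.18 is the XRT series mentioned in this thread; the version
# actually pinned by the submodule should be checked in the repository.
EXPECTED_MAJOR_MINOR = (2, 18)


def xrt_version(xbutil_output: str):
    """Extract (major, minor, patch) from xbutil's banner, or None if absent."""
    match = re.search(r"XRT build version:\s*(\d+)\.(\d+)\.(\d+)", xbutil_output)
    return tuple(int(part) for part in match.groups()) if match else None


def matches_expected(version, expected=EXPECTED_MAJOR_MINOR) -> bool:
    """True when the major.minor pair equals the expected XRT series."""
    return version is not None and version[:2] == expected


# Banner lines copied from the xbutil output in this issue.
BANNER = "XRT build version: 2.17.0\nBuild hash:\n"

if __name__ == "__main__":
    found = xrt_version(BANNER)
    if not matches_expected(found):
        print(f"XRT {found} does not match expected series {EXPECTED_MAJOR_MINOR}")
```

Comparing only major.minor keeps the check tolerant of patch-level differences; whether that tolerance is appropriate for this driver is not confirmed in the thread.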