RX680M full system crash
Closed this issue · 8 comments
I'm not sure if this is a radeontop bug or a kernel bug, but when I try to run radeontop on the Ryzen 6900HS iGPU (RX680M), the system crashes completely and reboots every time. No logs survive the crash. The crash started happening after commit e3bbf06.
The laptop is an Asus G14 running Arch kernel 6.0.12, but I could reproduce it on earlier kernels too.
Reverting the commit fixed the crash for me on 6.1.0+ with Ryzen 6850u
That is my commit. It shouldn't have enabled itself on 6900 as that's RDNA2.
What are the following?
- What is the family name in the header at the top while running?
- What is your GPU's line from `lspci -nn`?
- What is the output of `vainfo`?
- Does it still crash if you completely delete the part enabling it? (Lines 374 to 379 in e3bbf06)
- If I revert the commit, the family name is `YELLOW_CARP`. Otherwise it crashes.
- 07:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt [Radeon 680M] [1002:1681] (rev c7)
- `vainfo`:
Trying display: wayland
vainfo: VA-API version: 1.17 (libva 2.17.1)
vainfo: Driver version: Mesa Gallium driver 22.3.5 for AMD Radeon Graphics (rembrandt, LLVM 15.0.7, DRM 3.49, 6.1.12-arch1-1)
vainfo: Supported profile and entrypoints
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointEncSlice
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileVP9Profile2 : VAEntrypointVLD
VAProfileAV1Profile0 : VAEntrypointVLD
VAProfileNone : VAEntrypointVideoProc
- Yes, it crashes even if I comment out the section you mentioned
The common denominator between you and other affected users seems to be laptop APUs. I don't have any laptops with AMD APUs to test on. The memory regions and registers might be different from the desktop GPUs that I tested on, causing the crashes.
It would be good to consider disabling my video encode/decode detection feature for all laptop APUs until we confirm the proper registers.
Never mind my idea about laptop APUs. The actual problem was more serious: I put the `if` statement around the display code but forgot the `if` statements when actually reading the memory. I fixed that in #152. Does that PR fix your crashes?
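To illustrate the kind of mistake described above, here is a minimal sketch in which the same guard covers both the register read and the display path. Every name in it (`struct gpu_state`, `has_video_regs`, `VIDEO_STATUS_WORD`, `read_video_status`, `format_video_status`) is hypothetical and not taken from radeontop or from #152; it only shows the pattern, assuming the feature is gated by a per-GPU flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the program's GPU state; not radeontop's real struct. */
struct gpu_state {
    bool has_video_regs;            /* true only when the video registers are known to be valid */
    const volatile uint32_t *mmio;  /* mapped register area, or NULL if nothing is mapped */
    size_t mmio_words;              /* size of the mapping in 32-bit words */
};

/* Made-up register index, for illustration only. */
#define VIDEO_STATUS_WORD 4u

/*
 * Guarded read: the feature check happens before memory is touched.
 * Returns false and leaves *out unchanged when the read is not safe.
 */
bool read_video_status(const struct gpu_state *gpu, uint32_t *out)
{
    if (!gpu->has_video_regs || gpu->mmio == NULL)
        return false;
    if (VIDEO_STATUS_WORD >= gpu->mmio_words)
        return false;
    *out = gpu->mmio[VIDEO_STATUS_WORD];
    return true;
}

/*
 * Display path: it goes through the guarded read, so it can never show
 * (or trigger a read of) a register on a GPU where the feature is off.
 * Returns the number of characters written, or 0 when nothing is shown.
 */
int format_video_status(const struct gpu_state *gpu, char *buf, size_t len)
{
    uint32_t status;
    if (!read_video_status(gpu, &status))
        return 0;
    return snprintf(buf, len, "Video busy: %u", (unsigned)(status & 1u));
}
```

Routing the display through the guarded read means a single check protects both places, so one cannot be updated without the other.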
Yes, this commit indeed fixed my crash, thank you for fixing it :)
I think there is something weird going on with the AMD GPU firmware though. I don't get how it's possible to crash the whole system from user space like this without any kernel logs or anything.