FW16 Freeze then Hang (FTH)
jcdutton opened this issue · 16 comments
Device Information
System Model or SKU
[ ] Framework Laptop 16 (AMD Ryzen™ 7040 Series)
No dGPU.
BIOS VERSION
3.05
Windows:
N/A
Linux:
Open a terminal and run the following command
sudo dmidecode --string bios-version
03.05
DIY Edition information
Memory: Manufacture and SKU
Kingston Fury Impact: Part Number: KF556S40-32
2x making 64GB total.
Storage: Manufacture and SKU
Model Number: WD_BLACK SN850X 1000GB
Firmware Version: 620361WD
Port/Peripheral information
- USB-C card, nothing plugged in.
- Empty
- Empty
- Empty
- USB-C card, FW16 PSU plugged in.
- USB-A card, nothing plugged in.
Standalone Operation
Are you running your mainboard as a standalone device? Is standalone mode enabled in the BIOS?
- No
Describe the bug
This has only happened to me once so far.
The symptoms are:
Power plugged in.
Playing a video.
Laptop hangs, screen freezes, laptop plays audio in a short loop.
Wait 20 seconds, no automatic reboot.
Wait 60 seconds, still no reboot.
Force reboot by pressing the power button for 10 seconds.
No logs are stored, so no crash log is available.
No pstore crash log output.
No useful S5_RESET_STATUS, because I had to manually long press the power button to reset it.
EC port80 output ran over the 4096-entry log limit I had set, so no useful output was captured there either.
I think that while it was in the "hang" state, it was still outputting port 80 output (keeps repeating the pattern):
Log index, Port 80 Value, Port 80 Value in ASCII or a decoded value.
"00005324","e825f022",".%.."
"00005325","e825f028",".%.("
"00005326","e825f90e",".%.."
"00005327","e825f90d",".%.."
"00005328","e825f90e",".%.."
"00005329","e825f90d",".%.."
"0000532a","e825f022",".%.."
...
Those are all values it also outputs when the laptop is running fine.
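To check whether a capture only cycles through the usual values, the rows can be tallied per Port 80 value. This is a minimal Python sketch, assuming the log has been saved as text in the three-column quoted format shown above. The function name `count_port80_values` is hypothetical and not part of any Framework or EC tool.

```python
import csv
import io
from collections import Counter


def count_port80_values(log_text):
    """Count how often each Port 80 value appears in a quoted-CSV capture.

    Assumes rows of the form: "00005324","e825f022",".%.."
    (hex log index, hex Port 80 value, decoded text). Rows with fewer
    than two fields, such as a literal "...", are skipped.
    """
    counts = Counter()
    for row in csv.reader(io.StringIO(log_text)):
        if len(row) < 2:
            continue
        counts[row[1].strip().lower()] += 1
    return counts


# The seven rows quoted in the report above.
sample = '''"00005324","e825f022",".%.."
"00005325","e825f028",".%.("
"00005326","e825f90e",".%.."
"00005327","e825f90d",".%.."
"00005328","e825f90e",".%.."
"00005329","e825f90d",".%.."
"0000532a","e825f022",".%.."
'''

for value, n in count_port80_values(sample).most_common():
    print(value, n)
```

Comparing the tally from a hang against one from a healthy session would show whether any value appears only during the hang.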
Long press of power button to power off gives:
"000054af","00001001","(S3->S0)"
Steps To Reproduce
Steps to reproduce the behavior:
- Start from a powered off laptop.
- Power on laptop
- Wait a random amount of time while playing videos (Netflix, YouTube, etc.).
- System freezes and does not self-reboot.
Note: I generally have the power plugged in most of the time. For all the FTH I have seen, the power was plugged in at the time.
Expected behavior
It should not randomly freeze then hang. (FTH)
Screenshots
N/A
Operating System (please complete the following information):
- OS/Distribution: Linux/Ubuntu
- Version: 24.04
- Linux Kernel Version:
uname -a: 6.13.7 <- Mainline compiled kernel.
Additional context
Related? (Sync Flood)
#41 (comment)
Just saw another of these.
No extra info to report.
The (Sync Flood) one results in an automatic reboot after 20 seconds, so this is different: there is no automatic reboot when this one happens.
I had disabled module "kvm", so no VMs were running, so this is not Virtual Machine related.
@jcdutton I assume you're using some form of disk encryption. Are you using luks/dm-crypt or perhaps SED/OPAL?
This is an issue I too am experiencing ( Linux ), and I'm not using luks/dm-crypt or SED/OPAL.
I'm now using Linux 6.15-rc4, but this issue goes back to 6.12.0/6.12.1 and the BIOS upgrade to 3.05 on 21st November 2024.
The common thing I've noticed with the FTH and FTR ( #41 ) issues is that none of them happen until I start changing between AC ( Framework 180W power adapter ) and DC ( battery ). One other person was saying the same on the Framework community forum last time I checked.
As long as I stay on AC after the system is booted ( including suspend/resume ), all is good. Based on what @jcdutton and some others have said, I think it's something to do with how the BIOS/EC is managing power state.
There was one bug in Linux sometime in 6.13 that caused a freeze, but that was fixed, and there were warnings/errors about it.
With the FTH/FTR there are no warnings or errors, and this has been going on since BIOS 3.05.
I've disabled the battery extender since BIOS 3.04 and charge limit has been set to 80% since day one.
I am running my own firmware on the EC that fixes all the charge / discharge cycling bugs with the FW EC firmware. I wrote the bug fixes myself.
I observed the problem about once a month. It has not been a month yet, but since using my own EC firmware, I have not observed either the FTH or FTR problem.
For those interested, my EC Firmware changes and install instructions are here:
https://github.com/jcdutton/EmbeddedController/wiki
I have recently observed this problem again with my FW16 AMD 7840HS.
I cannot prove it 100% yet, but it appears to happen more often when the PSU is connected to port 4,5.
I used port 4,5 for the PSU when I initially reported this problem. For whatever reason, I then used port 1,2 for the PSU and did not see the problem. I recently switched back to port 4,5 and am seeing the problem again.
Has anyone else seen any correlation between the PSU port and the "FTH" issue appearing?
For me it has often been while playing a YouTube video: the system hangs and plays the audio in a loop from the speakers.
Another new observation:
While in "FTH" state, using EC CCD.
I see many "SM-RMI Error 4". About one a second, repeating.
"apreset" does nothing.
"apshutdown" does shut down the laptop, like switching it off.
Although the Capslock light toggles when pressed, no input from the keyboard reaches the CPU.
I know this because I have rigged "sysrq-h" to output a port80 code, and I don't see that code appear on the EC CCD.
So, the new ask is, does changing the port the PSU is plugged into change how often people see FTH ?
All the FTH/FTR issues I've had were while using port 1 only or while on battery. I'll try 4 and 5 and get back to you on this.
edit1: and is this with your EC firmware changes applied too?
edit2: I often find the FTR/FTH ( FTH more frequently ) happens when watching a video ( youtube mostly ) and CPU is fully loaded ( compiling software for example ), single monitor and it happens more frequently when multi monitor is active
I had another FTH today.
While in "FTH" state, using EC CCD.
I see many "SM-RMI Error 4". About one a second, repeating.
So, it seems that the CPU is not responding to anything the EC sends it over the SM-RMI bus when this problem shows itself.
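To put a number on "about one a second", the error lines in a saved EC console capture can be counted and divided by the capture duration. This is a minimal Python sketch, assuming the console output was saved as text and that each error line contains the string "SM-RMI Error 4" as quoted above. The real console line format (timestamps, prefixes) is not shown in this thread, so the sample below is a placeholder, and both function names are hypothetical.

```python
def count_smrmi_errors(console_text, needle="SM-RMI Error 4"):
    """Return the number of lines in console_text that contain needle."""
    return sum(1 for line in console_text.splitlines() if needle in line)


def errors_per_second(console_text, capture_seconds):
    """Average error rate over a capture whose duration was timed by hand."""
    if capture_seconds <= 0:
        raise ValueError("capture_seconds must be positive")
    return count_smrmi_errors(console_text) / capture_seconds


# Placeholder capture: only the quoted error string comes from the report;
# the other line stands in for unrelated console output.
capture = "\n".join(["SM-RMI Error 4"] * 10 + ["(unrelated console line)"])

print(count_smrmi_errors(capture))
print(errors_per_second(capture, capture_seconds=10))
```

Matching on a substring rather than a full line keeps the count working if the console adds a timestamp or prefix in front of the message.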
I got an FTH today, while on battery. No PSU plugged in at all.
The same "SM-RMI Error 4" message on the EC console.
Maybe speaking too soon.
I have used ports 1, 4 and 5 so far and am still getting FTHs ( no FTRs ), and it doesn't feel much different.
Port 2, on the other hand: no FTR or FTH ( so far ). That was while compiling or running a game, watching YouTube on the secondary display, playing music, and running the Framework panel at 165Hz with VRR/Adaptive v-sync, all at the same time.
Sounding like a word-vomit parrot here, but I think I'm seeing a pattern ( triggering FTH/FTR ). I've noticed that if I have dual monitors ( second monitor is 3840x2160@60 ), a YouTube video playing in Firefox and something with moderate to heavy CPU usage running ( I was running Dota 2 spectating a game, but compiling stuff works too ), the system FTHs in less than 60 minutes most of the time, though it can take longer. The record is 7 minutes. This combination gives a higher chance of triggering FTR/FTH.
With moderate to heavy CPU usage on its own, or a YouTube video on its own, and no dual monitor enabled, the system still FTHs, but it usually takes days to weeks, and the FTR issue is less common, like it barely happens.
I'm using
Framework 16 AMD Ryzen 7 7840HS with Radeon 780M ( no dGPU )
RAM/Memory 32GB ( 2x16GB ) Framework
NVME 2280: Western Digital SN850X 2TB - Firmware 620361WD
NVME 2230: Western Digital SN770M 2TB - Firmware 731120WD
BIOS 3.05
Gentoo Linux 2.17 ( Linux 6.15.0 mainline realtime, compiled by clang 20.1.6 with march and mtune set to znver4 )
KDE Plasma 6.3.5 Wayland
Firefox 139.0.1 Wayland
- Port 1 - Framework audio module
- Port 2 - USB-C module - Framework 180W power adapter
- Port 3 - USB-C module - nothing plugged in
- Port 4 - USB-C module - USB-C to DisplayPort UGREEN 8K adapter with Dell P2145Q monitor
- Port 5 - USB-C module - nothing plugged in
- Port 6 - USB-C module - nothing plugged in
Spoke too soon, it just FTH'd twice on port 2.
One of the things I've noticed: after an FTH, if I reboot and put my session back into the same state as before, it FTHs again in less than 10 minutes.
I've noticed this numerous times: I don't FTH for days or weeks ( this is on AC ), and when it does FTH, I reboot and try to continue as before, and it FTHs in less than 10 minutes. It's like there's something lingering for some time. So I put things in a lighter workload state ( as mentioned previously ), and some time later ( usually days ) I gradually increase the workload again until it FTHs, and this repeats.
I've bought Crucial 128GB Kit (64GBx2) DDR5-5600 SODIMM and installed today.
After installing the memory, I reset the BIOS and later did my usual stuff to try and trigger FTR/FTH, and it FTH'd in less than an hour. Not surprising, but I thought I'd add that just in case.
updated to bios 3.06 beta
I did my usual stuff
in performance mode, ac adapter plugged in, multi-display, youtube playing on external display while game on primary display
in power-save mode, using battery, single display ( framework ), firefox browsing and lmstudio downloading a model ( no model loaded )
All within 5 hours, and it still FTHs.
At this point I think I'll talk to support
OK, maybe I spoke too soon.
After what I thought was an FTH (but was actually a crash), I looked at the kernel logs and noticed that there were some errors related to tasks being soft locked, and the battery was quite warm.
After rebooting, battery health went from 99% (before BIOS 3.06 update) this morning down to 62% after doing BIOS update and later crash.
No idea what's going on with the battery, I'll buy a new one...
How did my battery lose 32Wh~ in less than 6 hours? Now it's only charging to 53Wh~ ( 85-86Wh~ earlier, BIOS update ).
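The reported figures are at least consistent with each other: 53Wh out of roughly 85Wh is about 62%. This is a minimal Python sketch of that arithmetic, with an optional read of the same ratio from Linux sysfs. The battery name `BAT1` is an assumption (check `/sys/class/power_supply/` for the actual name), and some batteries expose `charge_full` instead of `energy_full`, in which case the read returns None. Both function names are hypothetical.

```python
from pathlib import Path


def battery_health_percent(full_wh, design_wh):
    """Health as the ratio of current full capacity to design capacity."""
    if design_wh <= 0:
        raise ValueError("design_wh must be positive")
    return 100.0 * full_wh / design_wh


def read_health_from_sysfs(name="BAT1", root="/sys/class/power_supply"):
    """Read energy_full / energy_full_design (both in uWh) if present.

    The battery name is an assumption. Returns None when the files are
    missing or unreadable.
    """
    base = Path(root) / name
    try:
        full = int((base / "energy_full").read_text())
        design = int((base / "energy_full_design").read_text())
    except (OSError, ValueError):
        return None
    if design <= 0:
        return None
    return battery_health_percent(full, design)


# Figures from the comment above: 53Wh now, roughly 85Wh before.
print(round(battery_health_percent(53, 85), 1))
# Live reading on the current machine, or None if not available.
print(read_health_from_sysfs())
```

Logging this value before and after a BIOS update would show whether the drop is a change in the reported capacity or something that develops over time.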
Does anybody else experience visual glitches on the frozen Framework 16 monitor?
Small black glitch in upper left corner with Fedora 16 Workstation (above "Settings"):
Small black glitch in upper left corner with Fedora 16 Sway:
Small part of screen repeated on whole screen with Fedora 16 Sway:
Video of Framework 16 monitor frozen and glitching on Arch Linux with Sway: https://share.mb.sb/framework_16_monitor_frozen_and_glitching.mp4
If external monitors are attached, they continue to work and the system stays responsive.
Suspending the laptop and unsuspending it returns the monitor to a normal state.
This also happens without the dGPU (AMD Radeon RX 7700S) and without external monitors attached.
The issue seems to occur more often when the GPU is being used, for example when performing a GPU benchmark.
I've seen these glitches with the following OSes:
- Fedora 16 Workstation
- Fedora 16 with Sway
- Arch Linux 6.16.2 with Sway
Hardware: AMD Ryzen 9 7940HS w/ Radeon 780M Graphics.
BIOS version is 03.05.
Framework 16 AMD Ryzen 7 7840HS using Radeon 780M
RAM/Memory: 128GB ( 2x64GB ) - Crucial 128GB Kit (64GBx2) DDR5-5600
NVME 2280: Western Digital SN850X 2TB - Firmware 620361WD
NVME 2230: Western Digital SN770M 2TB - Firmware 731120WD
BIOS: 3.07
Gentoo Linux 2.17 ( Linux 6.17-rc6 mainline realtime, compiled by clang 21.1.1 march and mtune set to znver4)
KDE Plasma 6.4.5 Wayland
Well, with BIOS 3.05 and 3.06, I'm still getting FTH issues
The last two FTH's I've had seem to be happening during usage regardless of the processor work load ( CPU or iGPU or CPU+iGPU ) and performance profile and if I'm using battery or power adapter
I've upgraded to BIOS 3.07 and not done my usual triggers yet so I don't know what the FTH situation is like
While not related, I'm still experiencing FTR's with BIOS 3.05, 3.06 and now 3.07 which so far seems to be happening after resuming


