FrameworkComputer/SoftwareFirmwareIssueTracker

CPU frequency stuck at low values after suspend/resume on firmware 3.06


Device Information

System Model or SKU

Please select one of the following

  • Framework Laptop 13 (11th Gen Intel® Core™)
  • Framework Laptop 13 (12th Gen Intel® Core™)
  • Framework Laptop 13 (13th Gen Intel® Core™)
  • Framework Laptop 13 (AMD Ryzen™ 7040 Series)
  • Framework Laptop 13 (Intel® Core™ Ultra Series 1)
  • Framework Laptop 16 (AMD Ryzen™ 7040 Series)

BIOS VERSION

3.06

DIY Edition information

If you are experiencing an issue on a DIY system, please also fill out the memory and storage devices you are using.

Memory: 2 x Kingston KF556S40-32
Storage: WD Black SN850X

Describe the bug

After upgrading to firmware version 3.06, the system exhibits CPU throttling following a suspend/resume cycle.

  • The issue only occurs if the laptop remains suspended for a longer period. A quick suspend/resume does not trigger it.
  • So far, it reliably reproduces when the laptop is connected to the PSU. I have not yet tested battery-only scenarios but will provide feedback in a follow-up comment.
  • Once resumed, CPU frequencies remain limited between ~540 MHz and 1.1 GHz, resulting in a noticeable performance drop.

Additional observations:

  • According to cpufreq sysfs, the configured min/max CPU speeds are 1.1 GHz and ~5.1 GHz respectively.
  • Switching power profiles or adjusting the energy performance preference does not restore normal operation.
  • Disconnecting and reconnecting the PSU immediately resolves the throttling.
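
The cpufreq values mentioned above can be read with a short script. This is a minimal sketch, assuming the standard Linux cpufreq sysfs layout (`/sys/devices/system/cpu/cpuN/cpufreq/scaling_*_freq`, values in kHz); the helper names `read_khz` and `cpufreq_snapshot` are made up for illustration and are not part of any Framework tooling.

```python
from pathlib import Path

# Assumed default location of the per-CPU cpufreq directories on Linux.
CPU_ROOT = Path("/sys/devices/system/cpu")


def read_khz(path):
    """Return the integer kHz value stored in a cpufreq sysfs file, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def cpufreq_snapshot(root=CPU_ROOT):
    """Map each cpuN directory name to its configured min/max and current frequency in MHz."""
    snapshot = {}
    for cpufreq in sorted(root.glob("cpu[0-9]*/cpufreq")):
        values = {}
        for name in ("scaling_min_freq", "scaling_max_freq", "scaling_cur_freq"):
            khz = read_khz(cpufreq / name)
            values[name] = None if khz is None else khz / 1000
        snapshot[cpufreq.parent.name] = values
    return snapshot


if __name__ == "__main__":
    for cpu, values in cpufreq_snapshot().items():
        print(cpu, values)
```

In the throttled state described here, `scaling_max_freq` would still report roughly 5100 MHz while `scaling_cur_freq` stays at or below 1100 MHz.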

I will collect and attach EC logs from ectool console in the comments for further analysis.

Steps To Reproduce

Steps to reproduce the behavior (Linux):

  1. Connect PSU.
  2. Suspend the system.
  3. Keep it suspended for a while (~1 h).
  4. Resume the system and monitor CPU frequency at idle and under load. No core exceeds 1100 MHz, and the floor stays around 540 MHz.
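
To make step 4 less dependent on eyeballing a monitor, the check can be written as a small predicate over sampled frequencies. This is a rough sketch only: the function name `looks_stuck` is hypothetical, and the 1100 MHz ceiling is simply the value observed in this report, not a documented threshold.

```python
def looks_stuck(samples_khz, ceiling_khz=1_100_000):
    """Return True if every sample from every core is at or below the ceiling.

    samples_khz: iterable of per-core lists of frequencies in kHz,
    collected while the system is under load after resume.
    An empty sample set is treated as "not stuck" (nothing was measured).
    """
    flat = [khz for core in samples_khz for khz in core]
    return bool(flat) and max(flat) <= ceiling_khz
```

Samples taken under load on a healthy system should include values well above the ceiling, so the function returns False there.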

Expected behavior

CPU frequencies above 4 GHz under load.

Operating System (please complete the following information):

  • OS/Distribution: Arch Linux
  • Linux Kernel Version: 6.16.1

Additional context

Discussed on the Framework Community Forum: here

EC logs after changing governor / energy performance preference:

[261734.398100 AC BEST PERFORMANCE]
[261734.450700 PMF: SPL 30000mW, sPPT 30000mW, fPPT 30000mW, p3T 163352mW, ao_sppt 0mW]
PORT80: 0020
[261748.857500 AC BALANCED]
[261748.893800 PMF: SPL 30000mW, sPPT 30000mW, fPPT 30000mW, p3T 59752mW, ao_sppt 0mW]
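
For comparing the PMF power limits across states, the `PMF:` lines can be parsed into numbers. The following is a minimal sketch that assumes the line format shown in the log excerpt above; the function name `parse_pmf` is made up for illustration.

```python
import re

# Matches lines such as:
# [261734.450700 PMF: SPL 30000mW, sPPT 30000mW, ...]
PMF_RE = re.compile(r"\[(?P<ts>\d+\.\d+) PMF: (?P<fields>.*)\]")
# Matches one "name <value>mW" pair inside the field list.
FIELD_RE = re.compile(r"(\w+) (\d+)mW")


def parse_pmf(line):
    """Return (timestamp, {limit_name: milliwatts}) for a PMF line, or None for other lines."""
    match = PMF_RE.search(line)
    if not match:
        return None
    limits = {name: int(value) for name, value in FIELD_RE.findall(match.group("fields"))}
    return float(match.group("ts")), limits
```

Applied to the two lines above, this shows that only `p3T` changes between the "AC BEST PERFORMANCE" and "AC BALANCED" entries (163352 mW versus 59752 mW), while SPL, sPPT and fPPT all stay at 30000 mW.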

EC logs after disconnecting and reconnecting the PSU:

[262039.947200 cypd_write_reg8_wait_ack pre 0x84 ]
[262039.950100 PORT_DISCONNECT]
[262039.951000 events = 2, pre_events = 0]
[262039.952000 set AP throttling type 1 to on (0x00000010)]
[262039.956100 board_set_active_charge_port port -1, prev:3]
[262039.961500 event set 0x0400000000000000]
[262039.961300 cypd_write_reg8_wait_ack pre 0x4 ]
[262039.976600 cypd_write_reg8_wait_ack C:1 0x2032 response 0x0]
[262039.982600 event set 0x0400000000000000]
[262039.981000 cypd_cfet_vbus_control:3 fail:5]
[262039.994900 cypd_write_reg8_wait_ack C:0 0x1032 response 0x0]
[262039.996800 cypd_cfet_vbus_control:0 fail:5]
[262040.009100 cypd_write_reg8_wait_ack pre 0x0 ]
[262040.022900 cypd_write_reg8_wait_ack C:1 0x2032 response 0x0]
[262040.024500 cypd_cfet_vbus_control:3 fail:5]
[262040.025800 event set 0x0100000000000000]
[262040.081700 PMF: SPL 40000mW, sPPT 48000mW, fPPT 58000mW, p3T 118000mW, ao_sppt 0mW]
[262040.083500 events = 0, pre_events = 2]
[262040.084600 set AP throttling type 1 to off (0x00000000)]
[262040.085800 Updating charger with EPR correction: ma 440]
[262040.089600 CL: p-1 s-1 i500 v0]
[262040.091100 update charger!!]
[262040.099500 AC off]
[262040.101000 event set 0x0000000000000010]
[262040.109900 TODO Implement pd_set_new_power_request port 3]
PORT80: 3F40
[262040.115000 set AP throttling type 1 to off (0x00000000)]
[262040.121000 DC BALANCED]
[262040.122100 event set 0x0100000000000000]
[262040.152700 Battery 98% (Display 99.9 %) / 6h:40 to empty, not accepting current]
[262040.181100 PMF: SPL 30000mW, sPPT 36000mW, fPPT 44000mW, p3T 118000mW, ao_sppt 0mW]
PORT80: AA8F
[262040.223500 cypd_update_power_status:0=0x8]
[262040.227000 cypd_update_power_status:1=0x8]
PORT80: AA8E
[262040.705500 Battery 98% (Display 97.2 %) / 6h:40 to empty, not accepting current]
PORT80: 0008
[262041.215900 DC BATTERY SAVER]
[262041.216900 event set 0x0100000000000000]
[262041.266200 PMF: SPL 20000mW, sPPT 20000mW, fPPT 20000mW, p3T 118000mW, ao_sppt 0mW]
PORT80: AA8F


[262043.025400 CYPD_RESPONSE_PORT_CONNECT 3]
[262043.030100 board_set_active_charge_port port 3, prev:-1]
[262043.034600 event set 0x0400000000000000]
[262043.051600 cypd_write_reg8_wait_ack pre 0x80 ]
[262043.055700 event set 0x0400000000000000]
[262043.073200 Updating charger with EPR correction: ma 2640]
[262043.075500 event set 0x0400000000000000]
[262043.094000 CL: p3 s1 i3000 v5000]
[262043.095600 update charger!!]
[262043.101400 AC BEST EFFICIENCY]
[262043.102300 event set 0x0100000000000000]
[262043.112700 AC on]
[262043.114200 event set 0x0000000000000008]
[262043.123400 TODO Implement pd_set_new_power_request port 3]
PORT80: 0004
PORT80: AA8E
PORT80: AA8F
[262043.163000 CCG_RESPONSE_ACCEPT_MSG_RX 3]
[262043.164100 Updating charger with EPR correction: ma 440]
[262043.169700 sustain_battery_soc: Switched control mode to DISCHARGE]
[262043.173800 Battery 98% (Display 97.2 %) / 6h:45 to empty, not accepting current]
[262043.183200 cypd_update_power_status:0=0xe]
[262043.186800 cypd_update_power_status:1=0xe]
[262043.238300 Battery 98% (Display 99.9 %) / 6h:45 to empty, not accepting current]
[262043.295500 event set 0x0400000000000000]
[262043.304700 CYPD_RESPONSE_PD_CONTRACT_NEGOTIATION_COMPLETE 3]
[262043.308900 board_set_active_charge_port port 3, prev:3]
[262043.315500 cypd_write_reg8_wait_ack pre 0x84 ]
[262043.318100 event set 0x0400000000000000]
[262043.329100 AC BALANCED]
[262043.331400 event set 0x0100000000000000]
[262043.342000 event set 0x0400000000000000]
[262043.369000 event set 0x0400000000000000]
[262043.387000 event set 0x0400000000000000]
[262043.405900 event set 0x0400000000000000]
[262043.420400 PMF: SPL 30000mW, sPPT 30000mW, fPPT 30000mW, p3T 59752mW, ao_sppt 0mW]
[262043.422700 Updating charger with EPR correction: ma 2860]
[262043.434200 CL: p3 s0 i3250 v20000]
[262043.435400 TODO Implement pd_set_new_power_request port 3]
PORT80: 003B
PORT80: 0020
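
The log above is long, so it can help to reduce it to the power-mode and throttling transitions. This is a sketch under the assumption that EC console lines keep the `[timestamp message]` shape seen here; `interesting_events` is a hypothetical helper, and the choice of which messages count as interesting is mine, not Framework's.

```python
import re

# "[262040.121000 DC BALANCED]" -> timestamp and message; tolerates a space before "]".
LINE_RE = re.compile(r"^\[(\d+\.\d+) (.*?)\s*\]$")
# Power-mode announcements such as "AC BALANCED" or "DC BATTERY SAVER".
MODE_RE = re.compile(r"^(AC|DC) [A-Z ]+$")


def interesting_events(log_text):
    """Return (timestamp, message) pairs for power-mode, AC on/off and AP throttling lines."""
    events = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skips PORT80 lines and blank lines
        timestamp, message = float(match.group(1)), match.group(2)
        if (
            MODE_RE.match(message)
            or message in ("AC on", "AC off")
            or message.startswith("set AP throttling")
        ):
            events.append((timestamp, message))
    return events
```

Run over the disconnect/reconnect log, this leaves only the throttling on/off lines, the "AC off" and "AC on" events, and the sequence of AC/DC power modes.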

When the issue occurred, what was the power mode on your system?
Was it "Performance"? I've only been able to reproduce the issue when the power mode is set to "Performance."

I followed your steps to reproduce the issue with the power mode set to "Performance." When the system resumes from suspend, some cores run at 1100 MHz while others run at 544 MHz. If I then run a stress test, all cores drop to 544 MHz.

I indeed forgot to mention that. Because of FrameworkComputer/EmbeddedController@3ed7daa I checked both balanced and performance, and both produced the same result. It's up to you folks to find the problem now; my guess is an interaction between the EC and UEFI, or some kind of missing event/update on wakeup.

Despite you having a repro now, I remain happy to be a guinea pig if needed.

Let me reiterate my last comment: I mostly keep my system on balanced to avoid excessive heating. I only switched to performance once, to test it after I spotted the change of behavior in the mentioned commit, as having it on performance should produce the same results as before the diff.

Also, what may make a difference: when I was testing it before, I had my 100W travel PSU connected, not the Framework 180W.

Encountered this with the 3.06 update as well. I didn't look into it, as a revert back to 3.05 fixed it. Here is what I do have.

  • Happened with the FW 180W power supply, nothing between it and the mainboard.

  • The initial update to 3.06 did not cause any issues, and there were no initial issues on port 1. I don't know how much faith to put into this. The machine may have stayed powered after the update. I did use hibernate, which requires powering off, but the plug itself never came out. It did sleep or restart several times, though.

  • The machine was powered off, unplugged, and powered back on (from a dead battery) with power in port 4. That's when the issue happened.

  • No amount of on/off cycles, restarts, state toggles, or unplugs resulted in anything faster than 500 MHz.

  • A downgrade to 3.05 did fix it.

What I didn't do:

  • Power the machine in port 1 before the downgrade. (That port may have been unaffected)
  • Power the machine on battery. (It might be related to this note in the 3.05 BIOS.)

Fix issue where the CPU will be stuck at 500Mhz if the system is powered from an >100W charger through some brands of MFD hubs.

Some other facts: I mostly use port 4 to power the device. The charger is the >100W FW 180W, the machine has a dGPU, and the charger plugs directly into the module with no hubs in between. The machine has never experienced a throttling issue before, and port 4 is the primary port used to power the device (hence the revert).

We found the issue was caused by a change that adjusts the GPU power limit based on its power state. The original power limit code didn't control this well. After recently refactoring the power limit code, we can no longer reproduce the issue, though the refactored code is still under validation. Since the new code has a different structure, we don't think it's worth creating another solution for the current EC code. We plan to revert the change in the 3.07 BIOS for now and will soon release a new BIOS that includes the refactored power limit code.

and will soon release a new BIOS that includes the refactored power limit code.

Any news on when 'soon' is? I'm running on 3.06 and the bug is extremely annoying in day-to-day work...

Thanks

@quinchou77 👋🏻 Hi there! I'm just trying to understand what you said above. Is this correct:

  • 3.06: Current release. Has bug. (Stuff busted)
  • 3.07: BETA / Under validation. Has reverted code but not the new code fix (Stuff busted, but not as bad)
  • 3.08: TBA. This will have the fix + new code (FIX)

is this correct?

@PureKrome Basically, it is correct.
3.07: reverts the dGPU BOCO mode solution, which means the behavior will be the same as in 3.05.
4.00: TBA. We are doing the code refactor in this version since we would like to support both AMD 7700 and NV 5070, which we just announced. This refactored code also fixes the power limit issue and includes support for dGPU BOCO mode.

3.07 is alpha for internal testing now. It will be moved to beta next Monday.

@quinchou77 - thank you kindly for the prompt updates. really appreciate them!

Hope 4.00 isn't too far away, then.

3.07 is alpha for internal testing now. It will be moved to beta next Monday.

Thank you for the updates. For now, I have downgraded back to 3.05 as 3.06 is practically unusable without constant workarounds to get it out of the 544MHz lock.

We plan to conduct the internal 4.00 alpha test next week, followed by the beta release two weeks later. Version 3.07 will still be released as stable, even with the known issue that 240W power causes PROCHOT.

Thanks @quinchou77 - really, really appreciate the update and the work you are all doing. Really looking forward to the 4.00 public release.

We plan to conduct the internal 4.00 alpha test next week, followed by the beta release two weeks later. Version 3.07 will still be released as stable, even with the known issue that 240W power causes PROCHOT.

Correction, PROCHOT is caused by ANY power supply, not only 240W.

I hope you guys resolve it.