espressif/esp-idf

ESP32-S3 Deep Sleep from APP CPU hangs and then watchdog resets (IDFGH-6949)

tannewt opened this issue · 6 comments

Environment

  • Development Kit: ESP32-DevKitC
  • Kit version (for WroverKit/PicoKit/DevKitC): v1
  • Module or chip used: ESP-32-S3
  • IDF version (run git describe --tags to find it):
    ESP-IDF v4.4-351-g121ddb87e5-dirty
  • Build System: Make
  • Compiler version (run xtensa-esp32-elf-gcc --version to find it):
    xtensa-esp32s3-elf-gcc (crosstool-NG esp-2021r2-patch3) 8.4.0
  • Operating System: Linux
  • Using an IDE?: No
  • Power Supply: USB

Problem Description

The chip fails to deep sleep from the APP core. Instead it restarts immediately due to a watchdog restart.

It hangs here in the IDF: https://github.com/adafruit/esp-idf/blob/circuitpython-v4.4/components/esp_hw_support/sleep_modes.c#L509

Expected Behavior

The chip should enter deep sleep.

Actual Behavior

The chip resets due to a watchdog timeout and wakes up.

Steps to reproduce

  1. Create a task on core 1.
  2. Call esp_deep_sleep_start().

Code to reproduce this issue

CircuitPython calling esp_deep_sleep_start() from core 1 (the APP core).

https://github.com/adafruit/circuitpython/blob/main/ports/espressif/common-hal/alarm/__init__.c#L178

Debug Logs

I (4080) gpio: GPIO[48]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0 
Deep sleep start
Syncing counters
Grabbing lock
Stalling other CPU
ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x8 (TG1WDT_SYS_RST),boot:0x8 (SPI_FAST_FLASH_BOOT)
Saved PC:0x40374200
SPIWP:0xee
mode:DIO, clock div:2
load:0x3fcd0108,len:0x1660
load:0x403b6000,len:0xb7c
load:0x403ba000,len:0x2f78
entry 0x403b6248
I (31) boot: ESP-IDF v4.4-351-g121ddb87e5-dirty 2nd stage bootloader
I (31) boot: compile time 14:34:46
I (32) boot: chip revision: 0
I (35) boot.esp32s3: Boot SPI Speed : 40MHz
I (40) boot.esp32s3: SPI Mode       : DIO
I (44) boot.esp32s3: SPI Flash Size : 8MB
W (49) boot.esp32s3: PRO CPU has been reset by WDT.
W (55) boot.esp32s3: APP CPU has been reset by WDT.
I (60) boot: Enabling RNG early entropy source...
I (66) boot: Partition Table:
I (69) boot: ## Label            Usage          Type ST Offset   Length
I (76) boot:  0 nvs              WiFi data        01 02 00009000 00005000
I (84) boot:  1 otadata          OTA data         01 00 0000e000 00002000
I (91) boot:  2 ota_0            OTA app          00 10 00010000 00200000
I (99) boot:  3 ota_1            OTA app          00 11 00210000 00200000
I (106) boot:  4 uf2              factory app      00 00 00410000 00040000
I (114) boot:  5 user_fs          Unknown data     01 81 00450000 003b0000
I (121) boot: End of partition table
I (126) boot: Defaulting to factory image
E (130) esp_image: image at 0x410000 has invalid magic byte (nothing flashed here?)
E (139) boot: Factory app partition is not bootable
I (144) esp_image: segment 0: paddr=00010020 vaddr=3c110020 size=49300h (299776) map
I (227) esp_image: segment 1: paddr=00059328 vaddr=3fc994f0 size=052bch ( 21180) load
I (233) esp_image: segment 2: paddr=0005e5ec vaddr=40374000 size=01a2ch (  6700) load
I (236) esp_image: segment 3: paddr=00060020 vaddr=42000020 size=10cba8h (1100712) map
I (515) esp_image: segment 4: paddr=0016cbd0 vaddr=40375a2c size=13abch ( 80572) load
I (539) esp_image: segment 5: paddr=00180694 vaddr=50000000 size=01010h (  4112) load
I (540) esp_image: segment 6: paddr=001816ac vaddr=600fe000 size=00028h (    40) load
I (554) boot: Loaded app from partition at offset 0x10000
I (554) boot: Disabling RNG early entropy source...
I (567) spiram: Found 16MBit SPI RAM device
I (568) spiram: SPI RAM mode: sram 80m
I (568) spiram: PSRAM initialized, cache is in normal (1-core) mode.
I (573) cpu_start: Pro cpu up.
I (576) cpu_start: Starting app cpu, entry point is 0x40378398
I (0) cpu_start: App cpu up.
I (812) spiram: SPI SRAM memory test OK
I (821) cpu_start: Pro cpu start user code
I (821) cpu_start: cpu freq: 240000000
I (822) cpu_start: Application information:
I (824) cpu_start: Project name:     circuitpython
I (830) cpu_start: App version:      7.3.0-alpha.0-124-g3089c9b7f-di
I (837) cpu_start: Compile time:     Mar 11 2022 14:34:44
I (843) cpu_start: ELF file SHA256:  4c1bb5053083140e...
I (849) cpu_start: ESP-IDF:          v4.4-351-g121ddb87e5-dirty
I (856) heap_init: Initializing. RAM available for dynamic allocation:
I (863) heap_init: At 3FCA9740 len 000368C0 (218 KiB): D/IRAM
I (869) heap_init: At 3FCE0000 len 0000EE34 (59 KiB): STACK/DRAM
I (876) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (882) heap_init: At 600FE028 len 00001FD8 (7 KiB): RTCRAM
I (889) spi_flash: detected chip: gd
I (893) spi_flash: flash io: dio
I (897) sleep: Configure to isolate all GPIO pins in sleep state
I (903) sleep: Enable automatic switching of GPIO sleep configuration
I (910) coexist: coexist rom version e7ae62f
I (916) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.

Other items if possible

Original CircuitPython bug is here: adafruit/circuitpython#6090

Any update on this?

z7jzk commented

I am also having this issue on an ESP32-S3 (did not have the same issue on ESP32). Any update?

i got the same issuse

Hi @tannewt! Sorry, I'm so late.
I am looking into this issue and can not reproduce it. I use the v4.4-351-g121ddb87e5 as you said and run the example below with the default sdkconfig options. Maybe some config options that you have set affect this issue?

#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "esp_log.h"
#include "esp_sleep.h"
static const char *TAG = "example";
RTC_DATA_ATTR static int boot_count = 0;

void test_task(void* ptr)
{
    const int deep_sleep_sec = 10;
    //ESP_LOGI(TAG, "Entering deep sleep");
    esp_deep_sleep(1000000LL * deep_sleep_sec);
}

void app_main(void)
{
    ++boot_count;
    ESP_LOGI(TAG, "Boot count: %d", boot_count);
    xTaskCreatePinnedToCore(test_task, "test_task", 4095, NULL, 3, NULL, 1);
}

It hangs here in the IDF: https://github.com/adafruit/esp-idf/blob/circuitpython-v4.4/components/esp_hw_support/sleep_modes.c#L509

What exactly was it pointed out? The line number does not make sense now, I think it points to the wrong line.
I see your print Stalling other CPU, Does it mean that stalling other CPU is not successful? Do you think this line makes this issue? esp_ipc_isr_stall_other_cpu() https://github.com/adafruit/esp-idf/blob/circuitpython-v4.4/components/esp_hw_support/sleep_modes.c#L534

@tannewt @z7jzk @Fuhua-Chen Could you give some more details that will help to reproduce this issue?
Thanks in advance.

Yes, it was hanging in the stall_other_cpu call and then hitting the interrupt watchdog. We are not using xTaskCreate to create the other task. Instead, we set CONFIG_ESP_MAIN_TASK_AFFINITY_CPU1 and all of app_main is run on the APP core.

You can mention me on the Adafruit Discord if you want to coordinate on it. (I'm pacific US timezone)

Ok, I found the issue. We're not using the IDF build system because we have our own. We were missing the -u ld_include_highint_hdl portion of the link command that keeps the highint isr and the ipc_isr_handler that stalls the other cpu. So, this isn't an IDF issue. Sorry! For others who have had this problem, make sure you map file has something for ipc_isr_handler.