assertion "netconn state error" in lwip when iterating a 2nd time through MQTT Start, Publish, Stop
We would like to use your library on ESP32 MCUs with the latest ESP-IDF to send sensor data to an MQTT server every 15 minutes. On each cycle, the program establishes a Wifi connection, opens the MQTT connection, publishes its metrics over MQTT, and then closes the MQTT connection and the Wifi connection.
This works fine for the 1st iteration, but the program crashes consistently during the 2nd iteration.
Sequence:
Start Wifi
esp_mqtt_start()
esp_mqtt_publish()
esp_mqtt_stop()
Stop Wifi
[wait 15 minutes in Production]
Start Wifi
esp_mqtt_start()
esp_mqtt_publish()
=> crash
The stack trace below involves the lwip and tcpip_adapter modules, which are invoked by the esp-mqtt library. These are the last significant log messages (unfortunately, a full core dump does not show any more stack information):
assertion "netconn state error" failed: file "C:/myiot/esp/esp-idf/components/lwip/api/api_msg.c", line 1055, function: lwip_netconn_do_delconn
abort() was called at PC 0x400d2ad7 on core 0
0x400d2ad7: __assert_func at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdlib/../../../.././newlib/libc/stdlib/assert.c:63 (discriminator 8)
Backtrace: 0x4008c478:0x3ffbdb20 0x4008c61b:0x3ffbdb40 0x400d2ad7:0x3ffbdb60 0x4011ea47:0x3ffbdb90 0x4010fda5:0x3ffbdbb0
0x4008c478: invoke_abort at C:/myiot/esp/esp-idf/components/esp32/panic.c:648
0x4008c61b: abort at C:/myiot/esp/esp-idf/components/esp32/panic.c:648
0x400d2ad7: __assert_func at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdlib/../../../.././newlib/libc/stdlib/assert.c:63 (discriminator 8)
0x4011ea47: lwip_netconn_do_delconn at C:/myiot/esp/esp-idf/components/lwip/api/api_msg.c:1881 (discriminator 6)
0x4010fda5: tcpip_thread at C:/myiot/esp/esp-idf/components/lwip/api/tcpip.c:474
CPU halted.
The program works fine when keeping the Wifi logic but removing the MQTT logic.
A reproducible ESP-IDF project can be found at https://github.com/pantaluna/support_esp_mqtt
Important: the instructions and the full make monitor logs are in the README.md.
Thanks for your help.
Thanks for the report! As I'm on holiday, I won't be able to run any tests myself until next week. But I think I know where the crash is coming from: I realized that the disconnect does not properly clean up its resources. That is usually not a problem because LWIP does the cleanup for us, but in your case, where the whole stack gets reinitialized, it causes crashes.
The function around https://github.com/256dpi/esp-mqtt/blob/master/esp_lwmqtt.c#L52 should probably look like this:
void esp_lwmqtt_network_disconnect(esp_lwmqtt_network_t *network) {
  // immediately return if conn is not set
  if (network->conn == NULL) {
    return;
  }

  // delete connection
  netconn_delete(network->conn);

  // reset network
  network->conn = NULL;
  network->rest_buf = NULL;
  network->rest_len = 0;
}
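The guard-and-reset shape of the function above can be tried on a host machine without any ESP-IDF or lwIP code. What follows is only a minimal sketch under stated assumptions: every mock_* name is hypothetical, mock_conn_t merely stands in for the lwIP connection handle, and free() plays the role that netconn_delete() plays in the real function. It shows that, with the NULL guard and the reset in place, repeated connect/disconnect cycles (including a redundant second disconnect) leave no live connection behind.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the lwIP connection handle; not the real type. */
typedef struct {
  int placeholder;
} mock_conn_t;

/* Hypothetical stand-in for esp_lwmqtt_network_t, using the same three field names. */
typedef struct {
  mock_conn_t *conn;
  void *rest_buf;
  size_t rest_len;
} mock_network_t;

/* Number of mock connections that are currently allocated. */
static int mock_live_conns = 0;

/* Mock connect: allocates a fresh connection handle and stores it. */
static bool mock_network_connect(mock_network_t *network) {
  mock_conn_t *conn = malloc(sizeof(*conn));
  if (conn == NULL) {
    return false;
  }
  conn->placeholder = 0;
  network->conn = conn;
  mock_live_conns++;
  return true;
}

/* Mock disconnect: same guard-and-reset shape as the proposed function. */
static void mock_network_disconnect(mock_network_t *network) {
  /* immediately return if conn is not set */
  if (network->conn == NULL) {
    return;
  }

  /* release the connection (stands in for netconn_delete) */
  free(network->conn);
  mock_live_conns--;

  /* reset network so a later call cannot touch the stale handle */
  network->conn = NULL;
  network->rest_buf = NULL;
  network->rest_len = 0;
}

/* Runs connect/disconnect rounds, calling disconnect twice per round.
   Returns the number of connections still allocated, or -1 on allocation failure. */
static int mock_run_cycles(int cycles) {
  mock_network_t network = {NULL, NULL, 0};
  for (int i = 0; i < cycles; i++) {
    if (!mock_network_connect(&network)) {
      return -1;
    }
    mock_network_disconnect(&network);
    mock_network_disconnect(&network); /* no-op because conn is already NULL */
  }
  return mock_live_conns;
}
```

Calling mock_run_cycles(2) mirrors the two start/stop iterations from the report. The mock does not reproduce the lwIP assertion itself; it only illustrates why clearing the handle after deleting it keeps the second iteration from working on stale state.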
If you have time, you could test my hunch; otherwise I will conduct some tests myself next week.
Your suggested code change fixes the problem. I assume you will update the master branch after the holidays. Thanks!
Thanks for testing. I had some time today and published v0.4.4 with the fix.