Combining AsyncMQTT_Generic and Portenta_H7_AsyncWebServer fails
javos65 opened this issue · 10 comments
Hi, back after some tests.
I combined two examples one-to-one: Portenta_H7_AsyncWebServer plus AsyncMQTT_Generic.
Same issue as with the PubSubClient library: Mbed OS crashes within 1-2 minutes once the web server starts receiving HTML requests.
Problem: the two Async libraries cannot coexist.
I can't post the issue on the Portenta_H7_AsyncWebServer repository, as it was archived after our last mail exchange.
Jay
Arduino IDE 1.8.18
Arduino IDE 2.0
Portenta H7 rev2
lib Portenta_H7_AsyncWebServer 1.4.2
lib Portenta_H7_AsyncTCP 1.4.0
lib AsyncMQTT_Generic 1.8.0
Hi @javos65
Good that you've done some tests.
I'm afraid there is some issue with Portenta_H7, Mbed, the libraries, or some combination of them.
Try using new examples for ESP32 and RP2040W at
- AsyncWebServer_MQTT_RP2040W for RP2040W
- AsyncWebServer_MQTT for ESP32
As I no longer have a working Portenta_H7, I don't think I can help here.
After testing these examples, if they work, you can post the issue on the Arduino Mbed core repository to ask for help. The issue might be very deep inside the Portenta_H7 core or libraries, because of multi-core processing and management, etc.
ESP32 and RP2040W are also multi-core MCUs, but they still work OK.
Good Luck,
Thank you.
I will look into more details, do more testing, and post it at Mbed support.
My impression is that it is indeed an Mbed OS issue, maybe related to how the Murata WiFi module drivers support multiple clients.
Please close this case.
Also try previous core versions (v2.5.4 and earlier) to see whether recent cores broke something.
Tested it on various Mbed OS versions: 3.5.4, 3.5.1, 3.3.0, 2.8.0 and 2.5.2.
All fail after web requests.
See: https://github.com/javos65/AsyncWebServer_plus_MQTT
Please also test the example Async_AdvancedWebServer_SendChunked on its own.
If it does not work with previous versions either, it could be a severe issue with the core mods, etc., since it was tested extensively when Portenta_H7_AsyncWebServer v1.4.2 was created.
Tested the SendChunked version: fails as well.
Tested a lean version with all web pages coded inline: fails as well, but takes longer.
(I reinstalled the toolchain and updated the firmware prior to this testing, just to be sure.)
Hi @salasidis
Could you please check, if you have spare time, why code that we tested and found OK before, such as Async_AdvancedWebServer_SendChunked, now suddenly can't run and just crashes? Very weird, as we tested it extensively and it was OK then. Is it something related to the recent core mods?
Could the recent core mods have something to do with the issue you posted recently, "Async Web Server - becomes unresponsive after 1-4 days of use"?
I don't have a working Portenta_H7 now and can't tell what's wrong. Can you help shed some light?
I have not tried MQTT yet, however; it is on my list of things to add to the project.
My crashing has been going on for a very long time, even before these mods, and I have been unable to figure it out, so I don't think it is related to the mods (I did the mods to see if they could solve some of the intermittent crashing issues).
I then thought it was due to insufficient stack size on the LWIP thread, but even after I increased it (by recompiling libmbed.a), it still crashed. It is possible that some lower-level issue in the main Ethernet library is causing these failures with Portenta / Mbed (I ran my unit with no Ethernet, simply collecting sensor data and logging to SD, and it ran with no failures).
I know that the LWIP thread likes to take 5-8k of stack space in my case, but it is only allocated 1200 by 3.5.4. I have recompiled libmbed.a to give it more space. Maybe this becomes more important when MQTT is used?
As for immediate crashes, I am still sending a 100+k web page with no issue. I am running 3.5.4 and have updated all the libraries. Do you have an example where this fails, so that I can reproduce it? I can try running it on one of the Portentas. Does it happen only if MQTT is installed?
I have a J-Link and could single-step through the code. If it crashes in 1-2 minutes, that may make it easier to debug (my crash takes 3-7 days to occur). I would also compile it with the larger LWIP stack.
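One common way to check a suspicion like the LWIP stack size above is "stack painting": fill the stack region with a known pattern before the thread runs, then later count how much of the pattern has been overwritten. The block below is only a host-side sketch of that technique on a simulated buffer, not code from Mbed OS or either library. All names (`kStackFill`, `paint_stack`, `simulate_usage`, `stack_high_water_bytes`) are hypothetical, and it assumes a descending stack that grows from the high end of the region toward the low end.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fill pattern written into every word before the thread runs.
constexpr std::uint32_t kStackFill = 0xDEADBEEF;

// Paint the whole (simulated) stack region with the fill pattern.
void paint_stack(std::vector<std::uint32_t>& region) {
    for (auto& word : region) {
        word = kStackFill;
    }
}

// Simulate a thread that touched `bytes` of stack. Assumption: the stack
// starts at the highest index and grows toward index 0.
void simulate_usage(std::vector<std::uint32_t>& region, std::size_t bytes) {
    std::size_t words = (bytes + sizeof(std::uint32_t) - 1) / sizeof(std::uint32_t);
    if (words > region.size()) {
        words = region.size();
    }
    for (std::size_t i = 0; i < words; ++i) {
        region[region.size() - 1 - i] = 0;  // any value other than the pattern
    }
}

// Peak usage in bytes: everything above the contiguous run of still-painted
// words at the low end of the region counts as "used at least once".
std::size_t stack_high_water_bytes(const std::vector<std::uint32_t>& region) {
    std::size_t untouched = 0;
    while (untouched < region.size() && region[untouched] == kStackFill) {
        ++untouched;
    }
    return (region.size() - untouched) * sizeof(std::uint32_t);
}
```

For example, painting an 8192-byte region and simulating 6000 bytes of use reports a high-water mark of 6000, which can then be compared against the configured size (the 1200 quoted above). The estimate can read low if the thread happens to write the fill value itself. Mbed OS also provides `mbed_stats_stack_get_each()` when built with stack statistics enabled; whether the precompiled Arduino core enables it is not confirmed here.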
Hi @salasidis
Sorry for not being clear. The crashing code is from the Portenta_H7_AsyncWebServer library, without MQTT yet, as @javos65 specified.
Can you copy the issue to Portenta_H7_AsyncWebServer, which has been un-archived recently?
I am sending 1 packet every 3 seconds, loading a 100k web page 1-2x/day, and doing an NTP time sync every hour. I also do a Modbus poll every 5 seconds or so.
With all that, the crash happens every 3-7 days. It happens at random times, or sometimes when the computer that has the running web page comes out of a locked screen that was off. I never saw any stack overruns, and there are no bursts of retransmits (verified with Wireshark).
I can run the example when available and let you know (I have a third, unused, new Portenta that I will use for testing this).
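To put the 3-7 day timeline in perspective, the traffic pattern described above can be turned into rough operation counts. This is only back-of-the-envelope arithmetic under the stated rates (a packet every 3 s, a Modbus poll taken as exactly every 5 s, an NTP sync every hour). The names `events_in`, `Workload` and `workload_over` are hypothetical and belong to no library.

```cpp
#include <cstdint>

constexpr std::uint64_t kSecondsPerDay = 24ULL * 60 * 60;

// Number of events in `days` days when one event fires every `period_s` seconds.
constexpr std::uint64_t events_in(std::uint64_t days, std::uint64_t period_s) {
    return days * kSecondsPerDay / period_s;
}

// Operation counts for the workload described in the thread.
struct Workload {
    std::uint64_t packets;       // one every 3 s
    std::uint64_t modbus_polls;  // assumed exactly one every 5 s ("or so" in the thread)
    std::uint64_t ntp_syncs;     // one every hour
};

constexpr Workload workload_over(std::uint64_t days) {
    return Workload{events_in(days, 3), events_in(days, 5), events_in(days, 3600)};
}
```

Under these assumptions, `workload_over(3)` gives 86,400 packets, 51,840 Modbus polls and 72 NTP syncs, and `workload_over(7)` gives 201,600 packets, 120,960 polls and 168 syncs. The unit therefore completes on the order of 10^5 network operations before each crash, far more than a test that fails within 1-2 minutes would.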