Freezes after some hours running
Closed this issue · 29 comments
Mine always freezes, after a few hours of running, with the LEDs staying on and showing the last active thingy of whatever was running.
Also, Ticking Clock just does not work: the display stays dark and no LEDs light up, whether I switch to it with the button or the web interface.
Otherwise, it all works until it freezes.
The reset button on ESP32 does not help when it freezes.
Super annoying when it freezes and needs a power reset =(
Tested with several clean installs (build/upload) of the latest original project, over the last month.
ESP32 by AZDelivery Dev Kit with CP2102, tested 3 units, same behaviour.
Tested several different high-output power supplies, so I do not think it's a power issue.
Any ideas about what might be the problem?
Thanks!
I had a similar problem with the clock (didn’t try it with the other displays) when the wireless connection was weak: the clock just stood still.
Maybe you can check the signal strength of the ESP32 in your router. That was the only hint I had as to why my device suddenly stopped working.
In the end my device was connecting to the access point that was farthest away from it. I guess roaming does not work (properly). After I managed to connect the device to an AP literally in sight of it, everything works fine now.
I have to confess to having the same problem, which kills the project a bit.
For me it worked well for several weeks, but now it does not last 24 h. Did yours get stuck from the beginning?
How hot do your devices get? I assume mine was killed by the temperature...
Maybe a new ESP32 with a heat sink would do...
Thanks for the comments!
It has behaved as described from the very beginning.
I tried a piece of aluminum as a heat sink over the last few days, and it still freezes. However, it ran a bit longer, a little more than 24 h.
I do not think it's Wi-Fi, as it happens with all the other plugins, which do not require Internet (I guess). But I will test this lead as well.
Having the same problem here! It might be related to Wi-Fi; the AP might disappear at night. The clock seems to freeze every now and then, and then I need to power cycle it. This was not the case previously.
I had the same issue. My suspicion was an unavailable NTP server, due to the connection being reset by the ISP overnight. So I set up my router as a time server and pointed the variable "NTP_SERVER" in "constants.h" to the router. Maybe a coincidence, but since then the clock hasn't frozen anymore.
Thanks for your reply! Looks like that might be the source of the problem.
Unfortunately I cannot use this solution, as it would require custom firmware for the router and not all routers can do that.
I wonder if anyone else has encountered such behaviour and has come up with a solution in the code itself?
A short follow-up: I installed new router firmware (Fritz!Box v. 07.81) and returned to the old problems. However, one thing is for sure: as soon as a little more wireless traffic occurs, the clock freezes. This seems to be reproducible: copy large amounts of data over WiFi and the clock freezes, or alternatively disconnect the internet connection. So basically the firmware is very sensitive to (internet) connection problems. To make a long story short: I think the issue is not limited to the NTP protocol. To me it seems like a general connection issue.
I also tried a mesh repeater in my first attempts. It didn't help. I can give it another try now, with the new router firmware, and will let you know.
Wrong! Easy to reproduce: the clock froze immediately, no matter whether there was a mesh repeater or not.
I need to update the previous statement: the clock hadn't connected to the AP with the better WiFi conditions. After setting up the repeater and starting the clock afterwards, I was able to copy larger amounts of data over WiFi and the clock didn't freeze.
Seems to be the same behaviour I experienced: 3 APs, one FritzBox, the LED panel near the FritzBox (see #112 (comment)). For whatever reason the ESP connected to one of the APs, in the worst case to the one with the weakest signal. I assume the wireless library does not handle WiFi steering properly.
I disconnected all the APs and started the ESP so it could only connect to the FritzBox. Since then I have never had the problem again. The ESP always connects to the FritzBox.
The clock seems to freeze when the WiFi signal is poor and the ESP can't receive a proper time signal. I assume some error handling is missing, but that is a task for someone with more coding experience :-)
Let me know if you share my opinion: these might be two separate issues, one being that the client does not choose the best connection (which would be a general issue) and the other being that the clock freezes without a proper connection.
Yes, these are 2 separate issues.
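For the first of these two issues (the ESP joining a far-away AP), one possible approach is to scan first and pick the strongest AP for the configured SSID before connecting. The following is only a minimal sketch of that selection logic, written as plain C++ so it can be tried off-device. `ScanResult` and `pickStrongestAp` are hypothetical names, not part of the project or of any library. On the ESP32 the fields would presumably be filled from `WiFi.scanNetworks()`, `WiFi.SSID(i)`, `WiFi.RSSI(i)` and `WiFi.channel(i)`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One entry per access point seen in a scan (hypothetical type).
// Plain data, so the selection logic runs on any machine.
struct ScanResult {
    std::string ssid;
    int rssi;     // signal strength in dBm; closer to 0 is stronger
    int channel;
};

// Returns the index of the strongest AP broadcasting `wantedSsid`,
// or -1 if that SSID was not seen in the scan at all.
int pickStrongestAp(const std::vector<ScanResult>& results,
                    const std::string& wantedSsid) {
    assert(!wantedSsid.empty());  // an empty SSID cannot be matched
    int best = -1;
    for (int i = 0; i < static_cast<int>(results.size()); ++i) {
        if (results[i].ssid != wantedSsid) continue;
        if (best < 0 || results[i].rssi > results[best].rssi) best = i;
    }
    return best;
}
```

The chosen entry's channel and BSSID could then be passed to `WiFi.begin(ssid, passphrase, channel, bssid)` to pin the connection to that AP. Whether this plays well with WifiManager's own connect logic has not been tested here.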
Any updates on this one?
On the one hand, it was good to hear that it wasn't only my problem, but on the other hand, it looks like it is not an easy fix.
Unfortunately, my (LabVIEW) background does not allow me (read: gives me no chance of success) to delve into the original code to fix it.
More experience than hard fact: I understand that the router manufacturer (AVM) confirmed a WiFi-related bug in the latest stable firmware release that affects my router model (7590ax). This bug has not been fixed in a stable release yet.
As soon as I updated to this 'buggy' firmware, frequent crashes occurred on the clock. To me it seems more and more obvious that the clock is very sensitive to WiFi connectivity issues. Even if it doesn't help directly, maybe someone else could try to confirm or refute my statements. I was able to reproduce crashes by copying large files over WiFi. Let us know your experiences.
The bug only crept in after a couple of updates. I did have quite a stable clock at some point.
It might have started after the code for WifiManager was added. I could try to revive an older build to see if that resolves the issue.
It might be worth a try to revert this code:
72cc87f
I have the same issue, so I can reproduce the situation.
I will try to fix the issue in the next few days.
I was able to reproduce the situation (more or less reliably) by putting the ESP/lamp between the router and a repeater and then turning on the microwave (which is close to the repeater). :)
In parallel the ESP was attached to the dev IDE and I monitored the serial output.
It looks like there is a connection loss and then the device switches to WiFi setup mode.
This causes the clock to freeze and looks like a complete freeze of the device.
Lost connection to Wi-Fi. Reconnecting...
*wm:AutoConnect
*wm:Connecting to SAVED AP: XXXXXXX
E (644180) wifi:sta is connecting, return error
[635121][E][WiFiSTA.cpp:317] begin(): connect failed! 0x3007
*wm:connectTimeout not set, ESP waitForConnectResult...
*wm:AutoConnect: FAILED for 869 ms
*wm:StartAP with SSID: Ikea Display Setup WiFi
*wm:AP IP address: 192.168.4.1
*wm:Starting Web Portal
Not sure yet how to fix/improve that. But maybe this information is already helpful. If your devices are in the frozen state, can you please check whether the WiFi setup mode is enabled and the SSID "Ikea Display Setup WiFi" shows up in a WiFi search?
I will try some potential fixes in the next few days.
This confirms my suspicion that the issue is due to the wifimanager changes. The setup used to work fine for days before.
I was able to debug the situation without the microwave :) by running around with the device attached to my laptop. That caused connection losses and the observed behaviour.
I have now found a fix by adding some more configuration to the WifiManager.
In main.cpp, at line 82, I added these lines:
wifiManager.setConnectRetries(10);
wifiManager.setConnectTimeout(10);
wifiManager.setWiFiAutoReconnect(true);
This is working fine and the device reconnects to my local WiFi after a connection loss.
In a longer connection loss situation the device will still switch to the WifiManager configuration mode. If you don't want that, or want to recover from that state once your local WiFi is up again, you can additionally add:
wifiManager.setConfigPortalTimeout(180);
After 3 minutes, that will stop the WifiManager configuration mode and run another connection attempt. During that time the clock will not work, so the clock still relies on a permanent WiFi connection. An offline mode would be possible but would be a larger code change.
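To make the resulting behaviour easier to reason about, here is a small, device-independent sketch of the sequence described above: retry a limited number of times, fall back to the configuration portal, and leave the portal again after the timeout. It models only what is described in this thread and is not WifiManager's actual implementation. `WifiState`, `WifiPolicy`, `WifiTracker` and `step` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

enum class WifiState { Connected, Retrying, ConfigPortal };

// Hypothetical mirror of the settings discussed above.
struct WifiPolicy {
    uint8_t  maxRetries;        // cf. setConnectRetries(10)
    uint32_t portalTimeoutSec;  // cf. setConfigPortalTimeout(180)
};

struct WifiTracker {
    WifiState state = WifiState::Connected;
    uint8_t   retriesUsed = 0;
    uint32_t  portalOpenedAtSec = 0;
};

// Advances the tracker by one connection attempt/observation.
// `linkUp` says whether the station is connected at time `nowSec`.
void step(WifiTracker& t, const WifiPolicy& p, bool linkUp, uint32_t nowSec) {
    assert(p.maxRetries > 0);
    if (linkUp) {                       // any success resets everything
        t.state = WifiState::Connected;
        t.retriesUsed = 0;
        return;
    }
    switch (t.state) {
    case WifiState::Connected:          // first failure: start retrying
        t.state = WifiState::Retrying;
        t.retriesUsed = 1;
        break;
    case WifiState::Retrying:           // retries exhausted: open the portal
        if (t.retriesUsed < p.maxRetries) {
            ++t.retriesUsed;
        } else {
            t.state = WifiState::ConfigPortal;
            t.portalOpenedAtSec = nowSec;
        }
        break;
    case WifiState::ConfigPortal:       // portal timed out: retry again
        if (nowSec - t.portalOpenedAtSec >= p.portalTimeoutSec) {
            t.state = WifiState::Retrying;
            t.retriesUsed = 1;
        }
        break;
    }
}
```

In this model the clock would be unavailable for as long as the state is `ConfigPortal`, which matches the up-to-3-minute gap mentioned above.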
Do you want me to add this as a pull request or something?
Thanks @luedi128, I just tested the suggested code, and it seems to work fine and solve the Wi-Fi disconnect issue! I will let it run for longer and report back if anything comes up!
@luedi128 could you create a PR for this? I have tried these options but for me a freeze still happened during the night.
@jekkos but in the frozen state, do you see the configuration WiFi from the WifiManager if you scan for WiFi networks? Only then do you have the same situation as I had.
Generally, if you only want the clock to run no matter what happens to the WiFi connection, you could also change the loop function in main.cpp to something like this. It just prevents the WifiManager from starting again once you have a connection and the current date/time. Sorry, this is just a code idea; I have not tested this code.
struct tm timeinfo;

void loop()
{
  Messages.scrollMessageEveryMinute();
  pluginManager.runActivePlugin();

  // Reconnect only while no valid time is available yet; once the time has
  // been set, a Wi-Fi drop no longer restarts the WifiManager.
  if (!getLocalTime(&timeinfo) && WiFi.status() != WL_CONNECTED && millis() - lastConnectionAttempt > connectionInterval)
  {
    Serial.println("Lost connection to Wi-Fi. Reconnecting...");
    connectToWiFi();
    delay(connectionInterval);
  }
}
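The condition in that loop can be pulled out into a small pure function, which makes it possible to check the logic without a device. This is only a sketch: `shouldReconnect` is a hypothetical helper, and its parameters stand in for `getLocalTime(&timeinfo)`, `WiFi.status() == WL_CONNECTED`, `millis()`, `lastConnectionAttempt` and `connectionInterval` from the snippet above.

```cpp
#include <cassert>
#include <cstdint>

// Decides whether loop() should attempt a Wi-Fi reconnect right now.
// Mirrors the condition above: reconnect only if the time is unknown,
// the link is down, and the retry interval has elapsed.
bool shouldReconnect(bool timeKnown, bool wifiConnected,
                     uint32_t nowMs, uint32_t lastAttemptMs,
                     uint32_t intervalMs) {
    assert(intervalMs > 0);
    if (timeKnown || wifiConnected) return false;
    // Unsigned subtraction stays correct when the millisecond counter
    // wraps around (after roughly 49.7 days for a 32-bit value).
    return static_cast<uint32_t>(nowMs - lastAttemptMs) > intervalMs;
}
```

One thing to keep in mind on the device: as far as I know, `getLocalTime()` on the ESP32 takes an optional timeout that defaults to 5000 ms and waits that long when no time has been set, so calling it on every loop iteration may stall the display until the first sync succeeds. Passing a short timeout as the second argument might avoid that, but this is untested here.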
Thanks for the suggestion, I was able to flash this using the webserver OTA and the clock keeps running now, also after disconnect. Which is great.
I forgot that my router's wifi signal is turned off at night.