Too many SD logs causes cpulimit error
thujones opened this issue · 7 comments
Describe the bug
RM TX16s - One day, while using a specific model on the radio, I started getting a cpulimit error.
Changing to other models works fine.
It appears that a large number of logs associated with the model causes this error.
To reproduce
- Set up any model
- In my case I have SD card logging enabled via Special Function when arm condition met
- Accumulate numerous logs (actual per model limitation not known, only that flushing log folder rectifies issue)
- On the next startup the error appears; presumably one of the startup functions parses the log files and crashes when there are too many for the model
- Workaround: changing the model name also clears the error, since it breaks the link to the existing SD card logs
Expected behavior
Clearly this is some technical limitation or similar. The expectation is unlimited logging; however, I suppose a reasonable accommodation, such as only looking at the last X logs, would rectify the problem.
Screenshots
N/A
Radio and model settings
Additional context
- Please provide all the version information requested by the documentation "Reporting Issues on Github"
- Please indicate the number of logs that typically cause the problem.
I am getting this same issue. I get an error on the telemetry page where the widget goes that says "ERROR in create(): CPU limi". Note the "t" I guess didn't fit in the message.
I got this on a tx16s in both edgetx 2.8.0 and 2.8.4, inav 6.0 and inav lua telemetry 2.2.0 & 2.2.1.
I started getting the issue with around 260 log files on my radio, but I'm not sure of the exact number since I was doing a lot of testing to try and figure out what was going on. Not all these log files were for the model that I had issues with, or even all from iNav. I only have one iNav model configured in this radio right now, so I don't know if it would be a problem for all models using the widget or not, and I already deleted the logs so I can't test adding the widget to other models. Yaapu telemetry loaded just fine for other models though. I also noticed that deleting the iNav telemetry script config file for this model and restarting the radio resolved the issue for one cycle, but as soon as I changed a setting, changed model or rebooted the transmitter I would get the above error again. I have the logging enabled similar to @thujones' post above, so it makes new files every flight on the TX.
I don't think there is anything a Lua widget can do about this. The only way to detect how many files there are is to read the directory .... which is what the widget is doing when the error occurs. So this is most likely a firmware limitation that the script cannot handle (other than completely removing the log replay capability from EdgeTX). It is not an issue for OpenTX as the file names are determinable a priori and it's not necessary to read the directory.
You might consider raising an EdgeTX issue for the problem "Reading a directory with many files causes a CPU limit error".
Some more details:
- Due to the EdgeTX method of naming log files (vice that of OpenTX), enumerating the "up to the last 5 files over the last 15 days" is CPU intensive for EdgeTX (it's trivial in OpenTX). This is done to prepare the list of eligible files for "Log Replay".
- In general, the EdgeTX naming is preferable to that of OpenTX, other than for this use case.
- EdgeTX limits the amount of work a script can do at startup (create()). If this is exceeded the script aborts with "CPU limit".
- The script can't determine a priori if this limit will be exceeded.
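To make the cost concrete, here is a minimal sketch of the "up to the last 5 files over the last 15 days" selection when it has to work from a directory listing. It is written in Python purely for illustration (the widget itself is Lua), and it assumes EdgeTX-style names of the form MODEL-YYYY-MM-DD-HHMMSS.csv; that time suffix, and the names `recent_logs` and `LOG_RE`, are assumptions and not taken from the widget source.

```python
import re
from datetime import date, timedelta

# ASSUMPTION: EdgeTX-style log name, MODEL-YYYY-MM-DD-HHMMSS.csv
LOG_RE = re.compile(
    r"^(?P<model>.+)-(?P<date>\d{4}-\d{2}-\d{2})-(?P<time>\d{6})\.csv$"
)

def recent_logs(listing, model, today, max_files=5, max_age_days=15):
    """Return up to max_files newest logs for `model`, at most max_age_days old."""
    cutoff = today - timedelta(days=max_age_days)
    matches = []
    for name in listing:  # every file in the folder is visited, whatever its model
        m = LOG_RE.match(name)
        if not m or m.group("model") != model:
            continue
        try:
            day = date.fromisoformat(m.group("date"))
        except ValueError:  # e.g. an impossible date such as 2023-02-30
            continue
        if day < cutoff:
            continue
        matches.append((m.group("date"), m.group("time"), name))
    matches.sort()  # oldest first; the time suffix makes a sort unavoidable
    return [name for _, _, name in matches[-max_files:]]
```

The loop runs once per file in the log folder, so the work grows with the total number of retained logs, not with the handful that are actually eligible for replay.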
So the options are:
- Expect the user to limit the number of retained log files to some indeterminate "limit" (currently seems to be around 200 for safety); or
- Disable the log replay function for EdgeTX (or just generally).
I've never considered the log replay function to be of value, so I'm minded to take the latter option, as expecting the user to remember to archive log files seems unlikely to be practical.
Personally, now that I know what is causing the issue, I'd rather just clean up the records more often and keep the option to use that feature even if I never do use it.
Retaining 200+ logs is going to be pretty useless to the average user, I would think, but having to remember to tidy things up might be a PITA, and deleting them from the radio in the field is a PITA.
Maybe a utility script to delete all the logs or something so when people see the error they can just go clear them from the radio then reboot?
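As one possible shape for such a cleanup, here is a minimal sketch that runs on a computer with the SD card mounted, not on the radio itself. It is Python, keeps the newest logs by modification time and reports (or deletes) the rest. The function names and the default of 100 retained files are hypothetical choices, picked only to stay below the roughly 200 files mentioned in this thread.

```python
from pathlib import Path

def logs_to_prune(log_dir, keep=100):
    """Return the .csv logs in log_dir that are older than the newest `keep`."""
    logs = sorted(Path(log_dir).glob("*.csv"), key=lambda p: p.stat().st_mtime)
    return logs[:-keep] if keep > 0 else logs

def prune_logs(log_dir, keep=100, dry_run=True):
    """List the logs that would be removed; delete them only if dry_run is False."""
    doomed = logs_to_prune(log_dir, keep)
    if not dry_run:
        for path in doomed:
            path.unlink()
    return [p.name for p in doomed]
```

Calling `prune_logs("/path/to/sdcard/LOGS")` (the mount path is an example) only lists what would go; passing `dry_run=False` actually deletes, so it is worth archiving the folder first.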
For me it's not a big deal at all now that I know what is causing it. I just wanted to add as much info as I could once I found it here on GitHub.
Thanks for the informed / informative feedback -- it makes a difference.
ETX/OTX Lua is an extremely prescriptive and restrictive environment, in part due to the overarching requirement not to block the important business of keeping the vehicle under control. Thus solutions that might seem appropriate in other environments may not work here.
I did experiment with moving the log evaluation from the initial create() to the menu being invoked; this merely postpones the issue until some degree of "more" files exist (512 files works, 1024 fails). So it's not an acceptable solution.
Rather than surprise people in the field ("wtf, it worked last flight!"), the only viable solution is to remove log replay for ETX.
TBD.
Time for plan B.
In order to replay logs on EdgeTX, the logs will have to be renamed (by the user) to conform with the OpenTX naming scheme, i.e. MODEL_NAME-YYYY-MM-DD.csv.
Other than inconveniencing the user slightly, this means that:
- We do not run into CPU limit issues when the user has a large number of logs, as we never call dir()
- We can again replay logs on B&W radios, as we don't need table.sort() which EdgeTX only provides on colour radios.
- We have consistent functionality and a single API across all firmware / radios.
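The reason the OpenTX naming scheme avoids the directory read can be shown with a minimal sketch, again in Python for illustration only and not the widget's actual Lua. With one MODEL_NAME-YYYY-MM-DD.csv file per day, every candidate name can be computed from the date and probed directly. The function names and the `exists` callback are hypothetical; on the radio the probe would be an attempt to open the file.

```python
from datetime import date, timedelta

def candidate_log_names(model, today, max_age_days=15):
    """Newest-first OpenTX-style names, MODEL_NAME-YYYY-MM-DD.csv, today back to the cutoff."""
    return [
        f"{model}-{(today - timedelta(days=n)).isoformat()}.csv"
        for n in range(max_age_days + 1)
    ]

def replayable_logs(model, today, exists, max_files=5, max_age_days=15):
    """Probe each candidate name with exists(name); stop once max_files are found."""
    found = []
    for name in candidate_log_names(model, today, max_age_days):
        if exists(name):
            found.append(name)
            if len(found) == max_files:
                break
    return found
```

The number of probes is fixed by the day window (16 names with the defaults above), however many logs sit on the card, and the names come out already ordered newest first, so no directory listing and no sort are needed.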