Unexpected behavior from corrupt log data
Closed this issue · 10 comments
Our log-slicing capability assumes that TimeUS values are non-decreasing as the line number increases. This is a reasonable assumption... but a corrupt log means our slicing fails to perform as intended, without alerting the user.
Of course we can't be robust to all kinds of corrupt data... but if we saw this issue happen more, we might consider doing something about it?
[Details]
Someone posted a log with a bugged value (probably some kind of data corruption). The file can be found in this forum post: https://discuss.ardupilot.org/t/arduplane-skywalker-help/33869?u=hunt0r
Look at the TimeUS values for the ARSP log message group. The problematic array index is 6691, corresponding to ARSP.LineNo(6691) = 652620. Check out the jump in TimeUS values around this index:
l.ARSP.TimeUS(6689:1:6693)
ans =
1.0e+18 *
0.000000001463860
0.000000001463959
4.213092985657280 <---- clearly wrong
0.000000001464060
0.000000001464160
The ARSP.TimeUS(6691) is 4.2131e+18. Compare that with ARSP.TimeUS(6690) = 1.4640e+09 and ARSP.TimeUS(6692) = 1.4641e+09 to see that it's ridiculously high. Or check out the jump in DatenumUTC, which is derived from TimeUS:
datestr(l.ARSP.DatenumUTC(6689:1:6693))
ans =
09-Oct-2018 16:37:34
09-Oct-2018 16:37:34
17-May-5526 23:20:47 <--- clearly wrong
09-Oct-2018 16:37:34
09-Oct-2018 16:37:34
When we slice based on TimeS, the corrupt value is guaranteed to be the "first" value over the limit, no matter what the slicing limit is, and it often becomes one of the endpoints of the slice, without the user understanding why. Plotting the sliced data versus time then looks very corrupted, because the corrupt value makes the plot limits silly.
I think it's both better and easier to remove corrupt messages from the log altogether.
In general I don't think it's straightforward to detect them, but in this particular case of a strictly monotonic field, it's quite easy.
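For an isolated upward spike in a field that should be strictly increasing, the detection can be sketched like this (illustrative only, not part of Ardupilog's API; the TimeUS values are taken from the snippet above):

```matlab
% Sketch: remove an isolated upward spike from a strictly increasing
% timestamp vector. Only handles single-sample spikes, as in this log.
TimeUS = [1.463860e9; 1.463959e9; 4.2131e18; 1.464060e9; 1.464160e9];

bad = find(diff(TimeUS) < 0);  % samples whose successor is earlier in time
TimeUS_clean = TimeUS;
TimeUS_clean(bad) = [];        % drop the corrupt sample(s)
```

Here `bad` picks out index 6691-equivalent (the third sample in this excerpt), since it is the only sample followed by a smaller timestamp. Runs of consecutive corrupt samples, or downward spikes, would need a more careful test.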
I think this is a very annoying issue.
Sometimes there are many timestamps that are dramatically wrong, and it's not easy to remove all of them.
I had to filter and look for spikes using the medfilt1 function to remove them, but it doesn't always work.
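The median-filter approach mentioned above could look something like this (a sketch; it needs the Signal Processing Toolbox for medfilt1, and both the window length and the threshold are arbitrary choices, which is probably why it doesn't always work):

```matlab
% Sketch: flag timestamp spikes by comparing against a moving median.
% Requires medfilt1 (Signal Processing Toolbox). Values from the log above.
t = [1.463860e9; 1.463959e9; 4.2131e18; 1.464060e9; 1.464160e9];

baseline = medfilt1(t, 5);          % 5-sample moving median tracks the trend
spike = abs(t - baseline) > 1e9;    % far from the local median => spike
t_clean = t(~spike);                % keep only the plausible samples
```

The moving median is robust to single outliers, so the huge 4.2131e+18 value stands out against its baseline, but closely spaced spikes can drag the median itself off and escape the threshold.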
Hi @isso11 ,
This issue was opened quite a while ago, and since then we have perhaps lost some "institutional knowledge".
Could you perhaps share a log where this issue is prevalent, so we can work towards a solution?
Thank you in advance.
I see. The original post did have a log that was still available.
What is your issue re. the corrupt timestamps? Is it that they break the slicing functionality of Ardupilog, or something else?
First of all, Ardupilog has been a great tool and often helps me massively. So many thanks for you.
My issue is that the slicing functionality always takes that distorted timestamp (or timestamps) with it, no matter how late I pick the starting time. I hope there could be some sort of filtering or fixing of the distortion.
First of all, Ardupilog has been a great tool and often helps me massively. So many thanks for you.
That's great to hear!
I'll see what can be done. Thanks for reporting it!
@isso11, hello!
I've tested slicing a log with corrupt dates once more. Due to the way slicing is implemented, it still works correctly even with corrupt timestamps.
For example, if your ATT topic has a corrupt message around 1000s and you slice from 500s to 1500s, that message will still be included.
This piece of code
sliced_log = log.getSlice([1000, 2000], 'TimeS');
plot(log.ARSP.TimeS)
plot(sliced_log.ARSP.TimeS)
will yield the correct data range for the sliced log.
Of course, the existence of the corrupt timestamp makes timeseries plotting unusable:
sliced_log.plot('ARSP/DiffPress');
So, I created a patch in the branch bugfix/corrupt_timestamps. Do you think you can give it a try and see if it improves the situation?
@Georacer Hello!
I've tried it today with 3 different logs containing corrupted timestamps. It worked perfectly with all of them.
Thank you for taking the time and effort to resolve the issue.
Here's a sample:
Corrupted plot
Fixed
That's great to hear. Thanks for testing!