hd-zero/hdzero-goggle

9.3.0 -critical safety -Add confirmation overlay before entering go_sleep durring live view

Closed this issue Β· 43 comments

Critical:
Do not allow simple double function button press to initiate go_sleep function. The init of go_sleep function from live view should require button press conditions that include presses from two or more distinct buttons.

Some suggested alternate methods to init go sleep from live view:
1 -corded long press of both the enter and function buttons?
2 -add overlay confirmation dialog that requires an additional press from enter button on double function button press? (hide overlay dialog after a couple seconds if no enter button is pressed)
3 -add menu option to configure how user initiates go_sleep from live view? (not ideal)

My instance of critical safety issue:
On 2023-12-09 I experienced an erroneous initiation of the go_sleep function while in flight. After testing for suspected software issues and skimming through the code I have not been able to find any software related mechanism to reproduce this issue. This has left me to suspect that the fan in my HDZero Goggle may have forced humidity or other debris through the goggle in such a was as to cause erroneous right (function) button presses.

My conclusion:
I have identified the quick double function button press to initiate go_sleep function from live view as a new potential point of flight failure that should be considered a critical safety issue. (Even without identifying a conclusive reason for my instance of go_sleep initiation during flight.)

2023-12-16 Updated issue name from:
"9.3.0 -critical safety -fortify intent before go_sleep init from live view"

I now believe the best resolution is to add an OSD overlay requiring enter press similar to the existing "To R8?" style channel change dialog.

evilC commented

IDK man, I am kinda of the opinion that Go Sleep should simply not work AT ALL while a feed is active

Yeah, I agree. But how does the goggle identify "active"? Sadly, I think user determination is the only way to identify the received signal is "active".

On multiple instances I have found myself concerned and frustrated when my goggle suddenly reports "To **?" when I am trying to toggle the OSD and accidentally rotate the wheel.

Both the HDZero channel change and go_sleep functions are safety issues while in live view and the no_dial.txt does not appear to work as intended (disables OSD toggle too).

I would like a menu option to disable all functions with the potential to break live video while in live view (both To channel and go sleep) .

Maybe a safer solution is to use the function button as a function button and require the function button to be held for rotating the wheel and activating the To channel function or double click enter button for go sleep function.

other thoughts?
-p

evilC commented

I had this bug happen to me today. I had this PR flashed.
I had literally flashed this firmware the night before - it happened on the maiden flight of this firmware, after about a minute's flying.
I am pretty sure that I had not touched any buttons whatsoever since turning on the goggles.
I am not wholly sure what I was running before - I think it was the latest stable release (Rev 13112023)
We need to address this as a matter of urgency.

But how does the goggle identify "active"

The DVR starts / stops when you have a connection, so clearly it can tell

PR #367 allows user options that reduce my concern for breaking live view during flight. :-)
Thank you @Master92

more info:
I still had 9.3.0 release source locally and started to dig into user options to lock out specific button handling in live view... before things got serious I went to re-base and I was pleased to see loads of good changes in regard to button handling :-) I do still wish go_sleep to be easy to enter... I may still attempt to evaluate a go_sleep confirmation dialog overlay.

evilC commented

Confirmation overlay? You mean like a yes/no in the view? That does not sound very agreeable either. Can we not just have it so that Go Sleep simply does not work at all when in live view? I don't really see any benefit to allowing it if we are currently receiving a feed from the quad

evilC commented

Not sure I agree with "Always be safer"
You are arguing that it is safer to obstruct someone's view with a dialog, vs not obstruct someone's view.
Of course not obstructing their view is safer.
[Edit] Oh I think I see what you mean - racers are sharing a channel - next person powers up before they activate sleep, so they would need some way to enter sleep while receiving from someone else's quad? Yes, this poses a potential problem

Yeah, I agree. But how does the goggle identify "active"? Sadly, I think user determination is the only way to identify the received signal is "active".

I can send the FC's g_is_armed status to goggle.

I plan to let goggle reproduce the problem in debug mode first. I want to find the root cause of the error first.

From Ryan:
I had a scary thing happen with latest goggle firmware last night. I was flying and then the goggle beeped, the goggle lost video, the loading screen (progress bar) showed and 3sec later, the video was back.
I use the double press to sleep function often. Before this feature was released, I used it with no problems at a 3 day long event. I suspect the problem is unrelated to sleep function (at least how it was working before it was merged into release).
It seemed more like a crash of some kind or maybe ELRS backpack changing modes. I think this because the goggles beeped loudly when the video stopped. The beep was the same as when ELRS backpack changes video channels on the goggles.
I just tested the sleep function again. It does not beep when sleep function is started. But it does beep loudly when sleep mode is exited.

The fact that Ryan's instance appears to have woke itself back up automatically further aligns with my theory of erroneous button press detection(s). Can we push a new release that includes #367? Even if this does not prevent the issue it has made me feel much safer.
-p

Isolation thoughts:
-add audible key press indications for both brief and held states?

Other thoughts:
-environmental/atmospheric state? (add additional software de-bounce?)
-is ELRS able to trigger button presses?

I'm also curious if this issue already existed before I added the sleep shortcut. Maybe nobody noticed because the only action was centering the head tracker and too few people used this feature for this issue to arise. Or maybe a change in the FPGA firmware raised this erroneous double click detection. Afaik, this part is responsible for issuing right button clicks, right?

evilC commented

erroneous double click detection

Mine happened when the sleep shortcut was set to long press right button

evilC commented

I would also like to add that as far as I can see, no safeguards have currently been implemented.

  1. I suggested to simply not allow sleep if live feed is active.
  2. @pitts-mo suggested a confirmation if live feed is active, so that eg racers can still use it if someone started broadcasting on thier channel.
    I would guess implementing the dialog for (2) would take more work than (1)

Assuming that is the case, then in the short term, how about this.
Implement (1), and have it emit a beep if it decides to not shut down due to live feed.
That way, we can fly and if we see a suspected sleep but hear no beep, we know that it is not going through that code path.
If we hear an unexplained beep mid-flight, we know that it IS going through that code path.
Either that, or maybe just logging so we can go back and see which code path it took to trigger the sleep (Or even if it is truly a sleep)

Longer term, we can implement (2), but by implementing (1), we are not taking away anything that racers do not already have with a stable release, we are just not giving them the shortcut sleep for now.

@fromNansaram β€” 2023/12/27 17:09 (discord)
I have problems with this new Rev 13112023 fw. First, the goggles go sleep mode frequently with no button pressed. Second turn down two button sleep mode with tracking mode on, the goggles’s recording goes on and off frequently. I could not record just 2M flight recording at the race event. It record on and off like a crazy. I think right button input filter is shorten at this FW and cause noisy button press action caused these two problems.

The exact thing happened happened to me yesterday. I don't know if it is the sleep function that's gettng trigerred, feels like some kind of crash. Happened on two differnt goggles(check race video below).

Goggles screen goes blank mid flight and comes back when I scroll the wheel. Sometimes there is a beep when it goes blank, other times there is no beep.

Time stamps are in the video description when the goggles go blank. The event vrx is still picking up video no problem. Only lost a quad so far. haven't killed anyone yet.

https://www.youtube.com/watch?v=D9kypVX-QFM

I was trying to capture the crash, on the rtsp stream on my phone.
https://youtu.be/DAt7pS9gnJg

It happened a couple of times today, but this was the only time I was able to catch in while recording the phone screen.

@arulrajesh same thing here ! during flight goggles went back into source menu and beeped. same as yours, didnt always beeping. no keys pressed.

its a NO-GO!!!! these goggles are 1 year on market and all you get is a random video breakup DURING FLIGHT while goggle is resetting?

https://www.facebook.com/groups/hdzero/posts/747064013988229/?comment_id=747068557321108&reply_comment_id=747075000653797&notif_id=1705513245311524&notif_t=group_comment_mention

Hi all:
I made a big discovery today.
self_test.txt is the information printed during my test.
How to test:

  • Place the transmitter somewhere far away indoors, with 25mw output.
  • Removed all the antennas of Goggle, created self_test.txt in the root directory of the SD card, and turned off the OLED sleep function.
  • Do not perform any operations after starting goggle. Goggle can receive weak video.
  • After 10 minutes, remove the sd card and insert self_test.txt.

Searching for Cmd=20 in self_test.txt, I found a lot of them and they were all accompanied by UART1 bad command or UART2 bad command.Theoretically, when an error occurs, any function of the right key may be executed.

I suspect that there is an error when v536 receives and parses the 5680 data. I think the error probably occurs in void uart_parse(uint8_t sel, uint8_t *state, uint8_t *len, uint8_t *payload, uint8_t *payload_ptr)
Can anyone analyze this and fix it?

evilC commented

@ligenxxxx Does this repro only on fw with the sleep mode on button code, or does this repro on latest stable?
As mentioned in the FB group, my mate has been seeing something similar on latest stable.

@evilC it happens on latest, 'stable' goggle fw..

evilC commented

@bloudman648 I was asking about the self_test thing that ligen was talking about.
AFAIK, it is impossible for the scenario that he is talking about to trigger sleep on latest stable, because he states Theoretically, when an error occurs, any function of the right key may be executed. On the latest stable, right button is unable to trigger sleep because it does not contain the button shortcuts feature.
So this leads me to believe that what he describes is potentially not the sole cause, because it appears to also be happening (Or at least something similar) on latest stable.

Edited to remove erroneous statements

@evilC ok but what happened on me during flight was a short black screen with a beep and goggle went back into source menu DURING FLIGHT....
Newest 'stable' fw on hdz page. 9.3.0
I didnt touched every button and have turn off autopower function too..
I am not the only guy where it happened..

evilC commented

@bloudman648 I am not denying that it happened to you (I have seen it happen, so I know that it is possible), what I am saying is that AFAIK the scenario that ligen has just discovered (A way that right button presses could be triggered even when there was no actual button press) is probably not the cause if running latest stable.
In your case (Goes back to menu) - right button does not trigger that, left button does.
In other cases (Goggles go into sleep mode) - right button cannot trigger that because in latest stable, right button cannot trigger sleep mode because the code that does that is only in unreleased versions.
So it seems to me that we possibly have two or more (Possibly unrelated?) bugs.
Hence me asking him if he has been able to repro the UART bad command entry in the logs on latest stable, or if this just happens on experimental builds.
Bad Things happening on unreleased builds are bad enough, but Bad Things happening on latest stable is way, way worse.
This is, however, assuming that it can only trigger right button presses. I notice that he is seeing bad command for both UART1 and 2, so maybe he meant that it could also trigger a left button press, in which case what you are seeing COULD maybe be caused by what he has discovered.

Edited to remove erroneous statements.

in latest stable, right button cannot trigger sleep mode because the code that does that is only in unreleased versions.

That's not true, 9.3.0 introduced the shortcut for entering sleep mode from live video via a double right button click.

What I'm curious about is does this 0x20 command accompanied by the error also occur on a previous version? Because then, we should've heard about recordings aborting and head trackers resetting randomly much earlier shouldn't we?

@evilC gotcha but sometimes the goggles went also on sleep mode, so the left and right button will be pressed spontaneously altought they were not touched.

evilC commented

That's not true, 9.3.0 introduced the shortcut for entering sleep mode from live video via a double right button click.
Ah, I was unaware of this. Thanks for clearing that up.

WRT recordings, I have certainly seen in the past unexplained breaks in recordings - not something I have noticed at flight time, but upon reviewing recordings I have certainly noticed clips that start mid-flight when my hands clearly would have been on the sticks.
I don't use HT, so I cannot comment on that.
I have enquired with my friend if he has HT on or not. Hopefully he had it off, and this new finding explains all the scenarios.
@bloudman648 Right, this is what ligen was describing in his last post - he has seen this happen, so we are close to squashing this bug. The big question is, is Head Tracking turned on or off on your goggles?
It does seem odd, however, that you go back to source menu rather than entering sleep mode, so it seems possible that the same bug is also maybe capable of triggering left button presses as well as right button presses.

@evilC i am pretty sure that head tracking was turned off! Is that the reason what triggers the bug? Should i turn back on?

evilC commented

@bloudman648 We think the bug is triggered by the code thinking that the right button was pressed, when it was not.
On 9.3.0, if HT is turned off, then a double-press of the right button triggers sleep mode. So by turning HT on, you just make the phantom right button input re-centre HT, which effectively does nothing.
However, in your case, it seems that sleep mode is not being triggered, menu mode is (Are you 100% sure on this? You defo saw the menu right? Not just a progress bar and then blank screen?).
As a precaution though, if HT is turned off, turn it on, it can't really hurt.

@evilC ye, will turn in back on bud.
It happened 2 different scenarios...
Once the goggles went into sleep mode during flight and the other scenario got me back to the source menu, like u have hold the left button and gors automatically back to hdzero source ....so it seems to me that both buttons are getting triggered due to a software issue and loss of video feed DURING FLIGHT.

evilC commented

I just spoke with my mate who was seeing random sleeps on latest stable, and he had HT off, so this definitely seems to be consistent with the current theory.

@evilC good to hear....

evilC commented

Flashed. Am probably going to go out for a quick fly in a couple of hours, so should be able to give it a test

evilC commented

https://github.com/hd-zero/hdzero-goggle/actions/runs/7582786131 Please test this fw.

Tested for about 5 packs this afternoon and no problems.
Go Sleep was assigned to LP of right button - just as it was when I first encountered this issue. HT was off (Not that it matters with this version?)

Interestingly, another mate of mine has been on 9.3.0 for ages with HT off, and had no problems. The only difference between him and the other 2 of us who were seeing it was that he is on 1w, we are on 200mw. @ligenxxxx from your analysis previously, I get the impression that reproing this issue was easier with weak signal strength?

evilC commented

@pitts-mo - yeah, that was my interpretation of what he said too.
@ligenxxxx Had another 2 days of flying with this firmware. Another 2 mates tried it too. So altogether we've had 30+ flights with this firmware and no sign of any issues

@evilC im still scared for the last few times it happened...
the betrayal of trust is still there,
so i will only fly again if it's really fixed on the new fw...

any things to say? did u had head-tracking on or off? what about elrs backpack function?

greetings:-)

evilC commented

@bloudman648 On this firmware, HT on or off makes no difference, because whether the Sleep shortcut is enabled or not is no longer tied to HT being on or off, it's tied to which button press activates sleep mode. (ie what the settings are in the Input menu)
I made sure that sleep mode was assigned to right button (I used Long Press, my mates used double-click) - ie we did everything we could to TRY and cause the bug to happen, and we couldn't get it to happen.

@evilC what about elrs backpack feature for dvr recording?

9b4c78b

Interestingly...there already was a right button long press problem in 9.1.0 (june 23).
I think it never got fixed from 9.1 to 9.3?

I'm glad there is new progress on this bug. I've uploaded the changes to #386
When a complete osd packet from RF is received through dm5680, it will be sent to v536.
But data from RF is not entirely correct, especially when the signal is weak.
In the past, 5680 did not check the correctness of packets, resulting in sometimes packet len errors. (Exceptionally) excessively large and incorrect packet sizes would eventually cause errors when v536 received and parsed data from 5680.
Now every osd data packet from r will be checked and will be handed over to v536 for analysis after it is confirmed to be correct.

in the user manual its descriped, we only need the hdzero_goggle 9.x.x.bin file for updating without the other boot,os,rx,va - bin files. should we flash the fix version that way too ? just the 9.3.1 bin file ?