Zero action causing "panic: notify: cannot decode internal mask"
imsodin opened this issue · 3 comments
The panic "notify: cannot decode internal mask" re-occurred since deploying notify with the additional info in the panic message (#194). The action code is 0
. The full trace can be found here: https://crash.syncthing.net/report/a269c4d92ca4513fc34c79277ea3eaeb42ab7ca9a561899639895ef404570e48
Unless the cause of this is know and fixable (@ppknap ?) it's probably best to just drop events with a zero action instead of panicking, agreed?
hey @imsodin ! Indeed, it's likely that we are hitting "leftovers" from the buffer now. We can try skipping the panic(or rather printing the msg in dbg mode and then returning). However, I'm not sure what is the Win completion port status when this happens, we might run into infinite loops if CPs are not able to recover.
I will send a PR with the patch but will also need someone to stress test that on Windows before we move forward.
So there's a possibility that instead of panicking, it will just silently go into a infinite busy-loop generating zero events instead, right?
The problem is the stress testing done here is releasing it to all of Syncthing users (a lot) and waiting for anonymous crash reports. So if infinite loops are a concern, I'd propose (temporarily) adding some counter/timer that still causes a panic if too many zero events happen back-to-back in some short interval. I believe a panic is preferable to silently looping. Does that make any sense?
And just for context: We are talking about ~600 reported occurrence by ~150 users in the last 14 days, out of minimum 25k active and reporting users (likely quite a bit more, we don't have stats on people with crash reporting enabled, at least not easily available).