lack of retry of a session key if it errors out
nosmaster89 opened this issue · 3 comments
If the session key fails due to a sig error from the ECC, it will not re-attempt to sign a session key. This leads to all data getting lost, at least until a reconnection is attempted.
2023-08-12T12:41:15.234016Z WARN run:run: gateway_rs::packet_router: failed to initialize session err=crypto error: signature error
After this it stops submitting packets to the HPR.
This has been observed on a Heltec 1.1 running gwrs 1.1.1.
Right, so the worst-case reconnect attempt is 30 minutes later. What would you expect it to do? The Heltec can't even sign one thing to get to a session key for packets.
I do see it's not actually following the normal reconnect retry behavior on signature failure though, which would at least start with a shorter reconnect time.
Can we not catch the sig error in the driver and just retry after n? I actually think the problem is caused by not enough time before commands. When I was testing mfr on this device, if I got a sig error I could lock the chip by never waiting before trying again. Waiting some time period seemed to release the chip again.
It's not really a problem for one-time signs; it's just a loss of PoC as long as the chip's given time to rest. But if the chip is kept busy it may cause problems.
I was under the assumption that if the session key failed then gwrs would fall back to signing packets with the ECC.
Can we not catch the sig error in the driver and just retry after n? I actually think the problem is caused by not enough time before commands. When I was testing mfr on this device, if I got a sig error I could lock the chip by never waiting before trying again. Waiting some time period seemed to release the chip again.
No, after N is a thundering herd problem in the making. So this way it backs off after every failure but at least starts within 5 seconds instead of the current 30 minutes.
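For illustration, here is a minimal sketch of the kind of backoff described above: first retry after about 5 seconds, a longer wait after every failure, capped at 30 minutes. Only the 5 second and 30 minute figures come from this thread. The `Backoff` type, the doubling schedule, and the jitter parameter are assumptions made for the example and are not the gateway-rs implementation. The jitter is what keeps many gateways from retrying in lockstep, which is the problem with a fixed N.

```rust
use std::time::Duration;

/// Hypothetical backoff state (illustrative only, not a gateway-rs type).
struct Backoff {
    min: Duration,
    max: Duration,
    attempt: u32,
}

impl Backoff {
    fn new(min: Duration, max: Duration) -> Self {
        Self { min, max, attempt: 0 }
    }

    /// Record a failure and return how long to wait before the next attempt.
    /// The un-jittered delay doubles per failure (an assumption) and is capped at `max`.
    /// `jitter` is a caller-supplied random value in [0.0, 1.0]; the returned delay
    /// lies between 50% and 100% of the capped exponential delay.
    fn next_delay(&mut self, jitter: f64) -> Duration {
        let factor = 2u32.saturating_pow(self.attempt);
        let base = self.min.saturating_mul(factor).min(self.max);
        self.attempt = self.attempt.saturating_add(1);
        // Treat a non-finite jitter as "no jitter" so mul_f64 cannot panic.
        let j = if jitter.is_finite() { jitter.clamp(0.0, 1.0) } else { 1.0 };
        base.mul_f64(0.5 + 0.5 * j)
    }

    /// Call after a successful session init so the next failure starts at `min` again.
    fn reset(&mut self) {
        self.attempt = 0;
    }
}
```

With `Backoff::new(Duration::from_secs(5), Duration::from_secs(1800))` and no jitter reduction, the waits would go 5 s, 10 s, 20 s and so on until they hit the 30 minute cap. In practice the caller would pass a fresh random value for `jitter` on each failure.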
It's not really a problem for one-time signs; it's just a loss of PoC as long as the chip's given time to rest. But if the chip is kept busy it may cause problems.
I was under the assumption that if the session key failed then gwrs would fall back to signing packets with the ECC.
That's not going to happen under the current HPR/gwrs assumptions. In the register message gwrs indicates it supports session keys, so it should :-).
Doing "fallback signing" is not great for performance and weakens the gwrs assertion that it supports session keys. It would have to check both keys, which is not good for performance either.