Yakifo/amqtt

Bug: Cancellation of MQTT publish in `_broadcast_loop` leaves the broker non functional

not-f-elsner opened this issue · 0 comments

If a running task in the `_broadcast_loop` gets cancelled, the whole loop terminates and the broker is left in a non-functional state. No more messages are broadcast until the broker is shut down and restarted.

It seems there is no additional logic in the broker that handles the `_broadcast_loop` terminating (or raising an exception).

I'm not sure how to reproduce this in a unit test, but I can successfully reproduce it locally with unclean client disconnects and reconnects.
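As a starting point for a test, here is a minimal sketch of the failure pattern in plain asyncio, outside of amqtt. Everything in it (`fragile_loop`, `demo_fragile`, the `asyncio.sleep` stand-in for a publish task) is hypothetical, and I have not confirmed that the broker's loop reaps its tasks exactly this way. It assumes Python 3.8+, where `asyncio.CancelledError` derives from `BaseException` and therefore slips past `except Exception`.

```python
import asyncio
from collections import deque


async def fragile_loop(queue, handled, running):
    """Hypothetical loop that reaps finished tasks with task.result()."""
    try:
        while True:
            while running and running[0].done():
                task = running.popleft()
                try:
                    task.result()  # raises CancelledError for a cancelled task
                except Exception:
                    pass  # does not catch CancelledError on Python 3.8+
            item = await queue.get()
            handled.append(item)
            # Stand-in for a long-running publish to one subscriber.
            running.append(asyncio.ensure_future(asyncio.sleep(3600)))
    except asyncio.CancelledError:
        # Indistinguishable from an intentional shutdown: the loop just ends.
        for task in running:
            task.cancel()


async def demo_fragile():
    queue, handled, running = asyncio.Queue(), [], deque()
    loop_task = asyncio.ensure_future(fragile_loop(queue, handled, running))

    await queue.put("first")
    await asyncio.sleep(0.01)
    running[0].cancel()  # simulate one publish task being cancelled
    await asyncio.sleep(0.01)

    await queue.put("second")  # wakes the loop, which then reaps the cancelled task
    await asyncio.sleep(0.01)
    await queue.put("third")  # never handled: the loop has already exited
    await asyncio.sleep(0.01)

    return loop_task.done(), handled
```

Running `asyncio.run(demo_fragile())` should show the loop task finished and the last message left unhandled, which matches the symptom described above.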

How to fix
The easiest fix would be to never exit the loop on its own, e.g. simply discard any individual tasks that have been cancelled without terminating the whole loop. The loop itself is currently cancelled when the broker shuts down.
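A rough sketch of the "discard cancelled tasks" idea in plain asyncio, not the actual amqtt code. `reap_finished`, `robust_loop` and `demo_robust` are made-up names, and the `asyncio.sleep` call stands in for a real publish task. The point is to check `task.cancelled()` before asking for the result, so a cancelled child never raises into the loop, while cancelling the loop task itself still shuts it down.

```python
import asyncio
from collections import deque


def reap_finished(running):
    """Drop finished tasks from the front of the deque without raising."""
    while running and running[0].done():
        task = running.popleft()
        if task.cancelled():
            continue  # a cancelled child is discarded, not re-raised
        exc = task.exception()  # also marks the exception as retrieved
        if exc is not None:
            pass  # a real broker would probably log this


async def robust_loop(queue, handled, running):
    """Hypothetical loop that survives cancellation of individual tasks."""
    try:
        while True:
            reap_finished(running)
            item = await queue.get()
            handled.append(item)
            # Stand-in for a long-running publish to one subscriber.
            running.append(asyncio.ensure_future(asyncio.sleep(3600)))
    except asyncio.CancelledError:
        # Only reached when the loop task itself is cancelled (shutdown).
        for task in running:
            task.cancel()
        raise


async def demo_robust():
    queue, handled, running = asyncio.Queue(), [], deque()
    loop_task = asyncio.ensure_future(robust_loop(queue, handled, running))

    await queue.put("first")
    await asyncio.sleep(0.01)
    running[0].cancel()  # simulate one publish task being cancelled
    await asyncio.sleep(0.01)

    for item in ("second", "third"):
        await queue.put(item)
        await asyncio.sleep(0.01)

    alive = not loop_task.done()
    loop_task.cancel()  # intentional shutdown
    await asyncio.gather(loop_task, return_exceptions=True)
    return alive, handled
```

Here `asyncio.run(demo_robust())` should report the loop as still alive after the child cancellation, with all three messages handled.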

If we need to keep that shutdown cancellation, we might also need more complex waiter logic (as is done for client connections) to restart the loop after it was cancelled without an intentional shutdown.
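One possible shape for such restart logic, again as a plain asyncio sketch under my own assumptions: `LoopSupervisor` and `demo_supervisor` are hypothetical and do not exist in amqtt, and this is not how the client connection waiter is implemented. A done callback restarts the loop unless `stop()` was called first.

```python
import asyncio


class LoopSupervisor:
    """Hypothetical helper: restarts a coroutine unless stop() was called."""

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable returning a coroutine
        self._stopping = False
        self.task = None
        self.restarts = 0

    def start(self):
        self.task = asyncio.ensure_future(self._factory())
        self.task.add_done_callback(self._on_done)

    def _on_done(self, task):
        if not task.cancelled():
            task.exception()  # mark any exception as retrieved
        if self._stopping:
            return  # intentional shutdown: do not restart
        self.restarts += 1
        self.start()

    async def stop(self):
        self._stopping = True
        if self.task is not None:
            self.task.cancel()
            await asyncio.gather(self.task, return_exceptions=True)


async def demo_supervisor():
    started = []

    async def worker():
        # Stand-in for the broadcast loop body.
        started.append(len(started))
        await asyncio.sleep(3600)

    supervisor = LoopSupervisor(worker)
    supervisor.start()
    await asyncio.sleep(0.01)

    supervisor.task.cancel()  # simulate an unintended cancellation
    await asyncio.sleep(0.01)

    await supervisor.stop()  # intentional shutdown, must not restart
    await asyncio.sleep(0.01)
    return supervisor.restarts, len(started)
```

A real version would likely need a backoff or restart limit, since a loop that fails immediately would otherwise be restarted in a tight cycle.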

Will provide a PR as soon as I'm sure how to adapt the code.