E (381850) TRANSPORT_BASE: poll_read select error 104, errno = Connection reset by peer, fd = 54
This error is intermittent, but it should be reproducible in less than 5 minutes.
To reproduce:
Set up a channel between alice (normal node) and bob (remote signer node).
Send ~10 keysends in a loop from bob to alice, and you will eventually get these logs on the signer side:
I (16510) sphinx_key::core::events: => starting the main signing loop...
I (23870) lightning_signer::node: 02e7 adding payment 4c403adc53b8a7f1ef7a7d82c0db4c79a90a13d1895b0ec7aa455db1e230cb14 -> 1000
I (150710) lightning_signer::node: 02e7 adding payment 11f8be10d081a434410385ee1fe11797f6678b9ce108ef5e97bf3480b22598f0 -> 1000
I (155970) lightning_signer::node: 02e7 adding payment 9fd2fa7307dde80a792fa9ce19765edae0877c6ee0fdaa2492d78a6fffa84fcf -> 1000
I (161090) lightning_signer::node: 02e7 adding payment 58d30b2de3b73636a5b9f887054d970072e5786339bb9d4df4fd1f1853f91f32 -> 1000
I (185870) lightning_signer::node: 02e7 adding payment 611de4aa05b070ce2c086e9056e13b031f9e90e2844e2002ae69a90fbec42d30 -> 1000
I (194260) lightning_signer::node: 02e7 adding payment 52b59ac6af9e54c33fbc6d03cfdeab2b8ed45a111f029e55d70c37a1185a4b07 -> 1000
I (200610) lightning_signer::node: 02e7 adding payment 1a70363b588a1663ea030de8ab04ebb1e31257301e6da797d9d7a2117f247424 -> 1000
I (224370) lightning_signer::node: 02e7 adding payment 7b9229746eae58ee85ea25897cfc88688b3e7cbf48150983b42860bc6ab40f92 -> 1000
I (229290) lightning_signer::node: 02e7 adding payment 85bd0ad72d1acf2ad12fd9d3e60c63d6aa935b255ed46d1ca6aea2053d6744c8 -> 1000
I (235640) lightning_signer::node: 02e7 adding payment 10a87920743b52ee54052ded7670374d5ad5e4396610c66cc3c597226acc119e -> 1000
I (377150) lightning_signer::node: 02e7 adding payment fa544eb8313195048b104b07bdb4cbf2a2e9f8c6fbf15470e1300ba596f092b1 -> 1000
I (381450) lightning_signer::node: 02e7 adding payment 6bafccba49567a3d4144d959ec03d9b7865c68407b007ecb392ec1e80eb77e33 -> 1000
E (381850) TRANSPORT_BASE: poll_read select error 104, errno = Connection reset by peer, fd = 54
E (381850) MQTT_CLIENT: Poll read error: 119, aborting connection
E (381860) TRANSPORT_BASE: poll_write select error 0, errno = Success, fd = 54
W (381870) TRANSPORT_BASE: Poll timeout or error, errno=Success, fd=54, timeout_ms=10000
E (381880) MQTT_CLIENT: Writing failed: errno=0
E (381880) sphinx_key::conn::mqtt: ESP_FAIL msg!
W (381890) sphinx_key::conn::mqtt: RECEIVED Disconnected MESSAGE
W (381890) MQTT_CLIENT: Publish: Losing qos0 data when client not connected
W (381900) sphinx_key::conn::mqtt: RECEIVED Disconnected MESSAGE
Guru Meditation Error: Core 0 panic'ed (Illegal instruction). Exception was unhandled.
@Evanfeenstra pretty sure this error happens because of the following complaint from rumqttd. It seems to be the part that complains first; then we get the problem on the signer side described above.
[2023-06-14T16:15:37.436 hsmd /sphinx_key_broker::looper DEBUG] SEND ON sphinx
[2023-06-14T16:15:37.436 hsmd /sphinx_key_broker::mqtt DEBUG] SENDING TO F96yex3J on topic sphinx
[2023-06-14T16:15:37.885 hsmd /sphinx_key_broker::looper DEBUG] GOT ON sphinx-return
[2023-06-14T16:15:37.889 hsmd /sphinx_key_broker::looper DEBUG] SEND ON sphinx
[2023-06-14T16:15:37.890 hsmd /sphinx_key_broker::mqtt DEBUG] SENDING TO F96yex3J on topic sphinx
[2023-06-14T16:15:38.286 hsmd /rumqttd::server::broker ERROR] Disconnected!! error=Network(Protocol(PayloadSizeLimitExceeded(5261)))
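For illustration, here is a minimal sketch of the kind of size check that the last log line suggests the broker is applying: a 5261-byte payload is rejected because it exceeds some configured maximum. The names `ASSUMED_MAX_PAYLOAD_SIZE`, `PayloadTooLarge`, and `check_payload` are hypothetical, and the 4096-byte limit is an assumption picked only for the example; the actual limit depends on how rumqttd is configured in the broker.

```rust
/// Hypothetical limit in bytes, chosen only for this example.
/// The real value comes from the broker's rumqttd configuration.
const ASSUMED_MAX_PAYLOAD_SIZE: usize = 4096;

/// Hypothetical error type describing a payload that is over the limit.
#[derive(Debug, PartialEq)]
struct PayloadTooLarge {
    size: usize,
    limit: usize,
}

/// Returns Ok(()) if the payload fits within `limit` bytes,
/// otherwise reports the payload size and the limit it exceeded.
fn check_payload(payload: &[u8], limit: usize) -> Result<(), PayloadTooLarge> {
    if payload.len() > limit {
        Err(PayloadTooLarge {
            size: payload.len(),
            limit,
        })
    } else {
        Ok(())
    }
}

/// Convenience wrapper that checks against the assumed limit above.
fn check_payload_default(payload: &[u8]) -> Result<(), PayloadTooLarge> {
    check_payload(payload, ASSUMED_MAX_PAYLOAD_SIZE)
}
```

Running a check like this on the sending side before publishing would let an oversized message be logged or rejected locally, instead of the broker dropping the whole connection.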
Check out this issue in esp-idf: espressif/esp-idf#10000
Looks like it's fixed in esp-idf > v5, so we would need to update our ESP-IDF-SYS to 0.33.1.
@Evanfeenstra I don't really see how that issue relates to this one. Happy to try it and see if it solves the problem though :)