Worker dies when it cannot connect to RabbitMQ
msramalho opened this issue · 2 comments
msramalho commented
Full worker logs:
[2023-12-20 22:44:36,188: INFO/MainProcess] Task app.worker.main.transcribe[<JOB_ID>] succeeded in 291.4594302158803s: None
[2023-12-20 22:44:36,190: CRITICAL/MainProcess] Couldn't ack 3, reason:ConnectionResetError(104, 'Connection reset by peer')
Traceback (most recent call last):
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
self.ack(multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
self.channel.basic_ack(self.delivery_tag, multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
return self.send_method(
^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
conn.frame_writer(1, self.channel_id, sig, args, content)
File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
write(buffer_store.view[:offset])
File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
self._write(s)
ConnectionResetError: [Errno 104] Connection reset by peer
[2023-12-20 22:44:36,194: CRITICAL/MainProcess] Couldn't ack 2, reason:BrokenPipeError(32, 'Broken pipe')
Traceback (most recent call last):
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
self.ack(multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
self.channel.basic_ack(self.delivery_tag, multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
return self.send_method(
^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
conn.frame_writer(1, self.channel_id, sig, args, content)
File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
write(buffer_store.view[:offset])
File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
self._write(s)
BrokenPipeError: [Errno 32] Broken pipe
[2023-12-20 22:44:36,195: CRITICAL/MainProcess] Couldn't ack 5, reason:BrokenPipeError(32, 'Broken pipe')
Traceback (most recent call last):
File "/opt/venv/lib/python3.11/site-packages/celery/worker/loops.py", line 97, in asynloop
next(loop)
File "/opt/venv/lib/python3.11/site-packages/kombu/asynchronous/hub.py", line 306, in create_loop
item()
File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 161, in __call__
return self.throw()
^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 158, in __call__
retval = fun(*final_args, **final_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
self.ack(multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
self.channel.basic_ack(self.delivery_tag, multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
return self.send_method(
^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
conn.frame_writer(1, self.channel_id, sig, args, content)
File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
write(buffer_store.view[:offset])
File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
self._write(s)
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
self.ack(multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
self.channel.basic_ack(self.delivery_tag, multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
return self.send_method(
^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
conn.frame_writer(1, self.channel_id, sig, args, content)
File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
write(buffer_store.view[:offset])
File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
self._write(s)
BrokenPipeError: [Errno 32] Broken pipe
[2023-12-20 22:44:36,199: ERROR/MainProcess] Error cleaning up after event loop: BrokenPipeError(32, 'Broken pipe')
Traceback (most recent call last):
File "/opt/venv/lib/python3.11/site-packages/celery/worker/loops.py", line 97, in asynloop
next(loop)
File "/opt/venv/lib/python3.11/site-packages/kombu/asynchronous/hub.py", line 306, in create_loop
item()
File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 161, in __call__
return self.throw()
^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 158, in __call__
retval = fun(*final_args, **final_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
self.ack(multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
self.channel.basic_ack(self.delivery_tag, multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
return self.send_method(
^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
conn.frame_writer(1, self.channel_id, sig, args, content)
File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
write(buffer_store.view[:offset])
File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
self._write(s)
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/venv/lib/python3.11/site-packages/celery/worker/loops.py", line 102, in asynloop
hub.reset()
File "/opt/venv/lib/python3.11/site-packages/kombu/asynchronous/hub.py", line 115, in reset
self.close()
File "/opt/venv/lib/python3.11/site-packages/kombu/asynchronous/hub.py", line 274, in close
item()
File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 161, in __call__
return self.throw()
^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 158, in __call__
retval = fun(*final_args, **final_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
self.ack(multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
self.channel.basic_ack(self.delivery_tag, multiple=multiple)
File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
return self.send_method(
^^^^^^^^^^^^^^^^^
File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
conn.frame_writer(1, self.channel_id, sig, args, content)
File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
write(buffer_store.view[:offset])
File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
self._write(s)
BrokenPipeError: [Errno 32] Broken pipe
[2023-12-20 22:44:36,203: CRITICAL/MainProcess] Retrying to establish a connection to the message broker after a connection loss has been disabled (app.conf.broker_connection_retry_on_startup=False). Shutting down...
I am also seeing these entries in the RabbitMQ logs, starting shortly before the worker errors above:
2023-12-20 22:42:35.693118+00:00 [error] <0.21747.357> closing AMQP connection <0.21747.357> (172.18.0.3:50020 -> 172.18.0.2:5672):
2023-12-20 22:42:35.693118+00:00 [error] <0.21747.357> missed heartbeats from client, timeout: 60s
2023-12-20 22:44:36.208319+00:00 [info] <0.21763.357> closing AMQP connection <0.21763.357> (172.18.0.3:50026 -> 172.18.0.2:5672, vhost: '/', user: 'guest')
2023-12-20 22:44:36.884389+00:00 [warning] <0.21751.357> closing AMQP connection <0.21751.357> (172.18.0.3:50038 -> 172.18.0.2:5672, vhost: '/', user: 'guest'):
2023-12-20 22:44:36.884389+00:00 [warning] <0.21751.357> client unexpectedly closed TCP connection
fspoettel commented
Did this happen once, or is it a recurring problem? If it only happens occasionally, I wonder whether we can make the system more robust by having the worker self-heal.
msramalho commented
It has been recurring. What self-healing approaches do you have in mind?
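One possible direction, offered as a minimal sketch and not a confirmed fix: the final CRITICAL line in the worker log says reconnecting is disabled because `broker_connection_retry_on_startup` is `False`, so turning on Celery's broker retry settings may let the worker reconnect instead of shutting down. The helper name `self_healing_broker_settings` is hypothetical; the three keys are Celery configuration settings, and `broker_connection_retry_on_startup` assumes Celery 5.3 or later. This does not address why RabbitMQ saw missed heartbeats in the first place.

```python
def self_healing_broker_settings() -> dict:
    """Return Celery settings that ask the worker to keep reconnecting.

    Hypothetical helper for illustration; only the dictionary keys are
    Celery setting names.
    """
    return {
        # Retry when an established broker connection is lost.
        "broker_connection_retry": True,
        # The log above shows the worker shutting down because this is False.
        "broker_connection_retry_on_startup": True,
        # None means "retry forever" (Celery's default is 100 attempts).
        "broker_connection_max_retries": None,
    }


# Usage, assuming `app` is the project's existing Celery instance:
#     app.conf.update(self_healing_broker_settings())
```

Whether retrying forever is appropriate depends on the deployment; a finite `broker_connection_max_retries` would let the process exit so that an external supervisor can restart it.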