bellingcat/whisperbox-transcribe

Worker dies when it cannot connect to RabbitMQ

msramalho opened this issue · 2 comments

full logs:

[2023-12-20 22:44:36,188: INFO/MainProcess] Task app.worker.main.transcribe[<JOB_ID>] succeeded in 291.4594302158803s: None
[2023-12-20 22:44:36,190: CRITICAL/MainProcess] Couldn't ack 3, reason:ConnectionResetError(104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
    self.ack(multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
    self.channel.basic_ack(self.delivery_tag, multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
    return self.send_method(
           ^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
    write(buffer_store.view[:offset])
  File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
    self._write(s)
ConnectionResetError: [Errno 104] Connection reset by peer
[2023-12-20 22:44:36,194: CRITICAL/MainProcess] Couldn't ack 2, reason:BrokenPipeError(32, 'Broken pipe')
Traceback (most recent call last):
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
    self.ack(multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
    self.channel.basic_ack(self.delivery_tag, multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
    return self.send_method(
           ^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
    write(buffer_store.view[:offset])
  File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
    self._write(s)
BrokenPipeError: [Errno 32] Broken pipe
[2023-12-20 22:44:36,195: CRITICAL/MainProcess] Couldn't ack 5, reason:BrokenPipeError(32, 'Broken pipe')
Traceback (most recent call last):
  File "/opt/venv/lib/python3.11/site-packages/celery/worker/loops.py", line 97, in asynloop
    next(loop)
  File "/opt/venv/lib/python3.11/site-packages/kombu/asynchronous/hub.py", line 306, in create_loop
    item()
  File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 161, in __call__
    return self.throw()
           ^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 158, in __call__
    retval = fun(*final_args, **final_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
    self.ack(multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
    self.channel.basic_ack(self.delivery_tag, multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
    return self.send_method(
           ^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
    write(buffer_store.view[:offset])
  File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
    self._write(s)
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
    self.ack(multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
    self.channel.basic_ack(self.delivery_tag, multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
    return self.send_method(
           ^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
    write(buffer_store.view[:offset])
  File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
    self._write(s)
BrokenPipeError: [Errno 32] Broken pipe
[2023-12-20 22:44:36,199: ERROR/MainProcess] Error cleaning up after event loop: BrokenPipeError(32, 'Broken pipe')
Traceback (most recent call last):
  File "/opt/venv/lib/python3.11/site-packages/celery/worker/loops.py", line 97, in asynloop
    next(loop)
  File "/opt/venv/lib/python3.11/site-packages/kombu/asynchronous/hub.py", line 306, in create_loop
    item()
  File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 161, in __call__
    return self.throw()
           ^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 158, in __call__
    retval = fun(*final_args, **final_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
    self.ack(multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
    self.channel.basic_ack(self.delivery_tag, multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
    return self.send_method(
           ^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
    write(buffer_store.view[:offset])
  File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
    self._write(s)
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.11/site-packages/celery/worker/loops.py", line 102, in asynloop
    hub.reset()
  File "/opt/venv/lib/python3.11/site-packages/kombu/asynchronous/hub.py", line 115, in reset
    self.close()
  File "/opt/venv/lib/python3.11/site-packages/kombu/asynchronous/hub.py", line 274, in close
    item()
  File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 161, in __call__
    return self.throw()
           ^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/vine/promises.py", line 158, in __call__
    retval = fun(*final_args, **final_kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 131, in ack_log_error
    self.ack(multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/kombu/message.py", line 126, in ack
    self.channel.basic_ack(self.delivery_tag, multiple=multiple)
  File "/opt/venv/lib/python3.11/site-packages/amqp/channel.py", line 1407, in basic_ack
    return self.send_method(
           ^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.11/site-packages/amqp/abstract_channel.py", line 70, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/opt/venv/lib/python3.11/site-packages/amqp/method_framing.py", line 186, in write_frame
    write(buffer_store.view[:offset])
  File "/opt/venv/lib/python3.11/site-packages/amqp/transport.py", line 347, in write
    self._write(s)
BrokenPipeError: [Errno 32] Broken pipe
[2023-12-20 22:44:36,203: CRITICAL/MainProcess] Retrying to establish a connection to the message broker after a connection loss has been disabled (app.conf.broker_connection_retry_on_startup=False). Shutting down...

I am also seeing these in the RabbitMQ logs, starting shortly before the worker errors:

2023-12-20 22:42:35.693118+00:00 [error] <0.21747.357> closing AMQP connection <0.21747.357> (172.18.0.3:50020 -> 172.18.0.2:5672):
2023-12-20 22:42:35.693118+00:00 [error] <0.21747.357> missed heartbeats from client, timeout: 60s
2023-12-20 22:44:36.208319+00:00 [info] <0.21763.357> closing AMQP connection <0.21763.357> (172.18.0.3:50026 -> 172.18.0.2:5672, vhost: '/', user: 'guest')
2023-12-20 22:44:36.884389+00:00 [warning] <0.21751.357> closing AMQP connection <0.21751.357> (172.18.0.3:50038 -> 172.18.0.2:5672, vhost: '/', user: 'guest'):
2023-12-20 22:44:36.884389+00:00 [warning] <0.21751.357> client unexpectedly closed TCP connection

Did this happen once, or is it a recurring occurrence? If it only happens occasionally, I wonder if we can make the system more robust by having it self-heal.

This has been recurring. What self-healing approaches do you see?
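
One possible direction, as a rough sketch rather than a tested fix: the last CRITICAL line in the worker log says that reconnecting after a connection loss is disabled, and the RabbitMQ log shows the broker closing a connection for missed heartbeats (timeout: 60s) about two minutes before the acks fail. The Celery settings below are meant to let the worker reconnect instead of shutting down. The helper names `resilient_broker_settings` and `apply_settings` are made up for illustration, and disabling heartbeats is an optional trade-off (it may stop the broker from dropping the connection during long transcriptions, but it also makes dead connections harder to detect). Whether any of this addresses the root cause here is an assumption.

```python
from typing import Any, MutableMapping


def resilient_broker_settings(disable_heartbeat: bool = False) -> dict[str, Any]:
    """Return Celery settings that favour reconnecting over shutting down.

    Hypothetical helper: the keys are Celery configuration options,
    the values are suggestions, not settings taken from this repository.
    """
    settings: dict[str, Any] = {
        # Reconnect when an established broker connection is lost.
        "broker_connection_retry": True,
        # Also retry if the broker is unreachable when the worker starts.
        "broker_connection_retry_on_startup": True,
        # None means keep retrying instead of giving up after a fixed count.
        "broker_connection_max_retries": None,
    }
    if disable_heartbeat:
        # 0 turns AMQP heartbeats off (trade-off: dead peers are noticed later).
        settings["broker_heartbeat"] = 0
    return settings


def apply_settings(conf: MutableMapping[str, Any], **kwargs: Any) -> None:
    """Merge the settings into a config mapping such as a Celery app's conf."""
    conf.update(resilient_broker_settings(**kwargs))
```

In the worker module this would be called once as `apply_settings(app.conf)`, where `app` is assumed to be the Celery application instance.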