pma/amqp

"second 'channel.open' seen" error when recovering from consumer crash

corben2 opened this issue · 2 comments

When a consumer is actively consuming messages, crashes, and then restarts, the following errors seem to occur:

** (stop) :unexpected_delivery_and_no_default_consumer
Last message: {:consumer_call, {:"basic.deliver", "amq.ctag-GdH1Icq6n6jamBqFqBKZdg", 31, false, "reconnect", "TraceId.#"}, {:amqp_msg, {:P_basic, :undefined, :undefined, :undefined, 2, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined, :undefined}, "spam"}}

14:13:05.066 [info] AMQP channel is gone (sub_chan). Reopening...
2024-07-30T14:13:05.066309+00:00 'Elixir.AMQP.Application.Channel':handle_info/2:153 <0.969.0> [info] AMQP channel is gone (sub_chan). Reopening...

14:13:05.066 [info] starting SelectiveConsumer
2024-07-30 14:13:05.067559+00:00 [error] <0.874.0> Error on AMQP connection <0.874.0> (10.89.1.70:50544 -> 10.89.1.69:5672, vhost: '/', user: 'guest', state: running), channel 1:
2024-07-30 14:13:05.067559+00:00 [error] <0.874.0>  operation channel.open caused a connection exception channel_error: "second 'channel.open' seen"
2024-07-30T14:13:05.068800+00:00 : <0.981.0> [warning] Connection (<0.981.0>) closing: received hard error {'connection.close',504,<<"CHANNEL_ERROR - second 'channel.open' seen">>,20,10} from server

The unexpected_delivery_and_no_default_consumer error is expected, I think, but the "second 'channel.open' seen" error is not. It makes the server close the connection, which breaks every other channel using that connection.

ono commented

Interesting. Is the queue defined as exclusive?

I guess this is what is happening...

  • the channel process is gone on the client, but the channel is still valid on the server side
  • when amqp tries to open a new channel for the queue, the server returns the "second 'channel.open' seen" error because it thinks that channel is already open
  • the server then raises a connection error, and amqp_client (the Erlang library) fails

Can you open separate connections for the other queues, so that this doesn't affect their channels?
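As a rough sketch of what separate connections could look like with application-managed channels (the `AMQP.Application.Channel` lines in the log suggest that is what is in use): the connection names `sub_conn` and `other_conn`, the channel name `other_chan`, and the password in the URL are assumptions for illustration. Only `sub_chan`, the host, and the user name come from the log above.

```elixir
# config/runtime.exs (or config/config.exs)
import Config

# Assumed names: :sub_conn, :other_conn and :other_chan are illustrative only.
# :sub_chan is the channel name that appears in the log output.
# The password in the URL is a guess; the log only shows user 'guest'.
config :amqp,
  connections: [
    # Connection dedicated to the consumer that crashes
    sub_conn: [url: "amqp://guest:guest@10.89.1.69:5672"],
    # Second connection, so a hard error on :sub_conn does not close it
    other_conn: [url: "amqp://guest:guest@10.89.1.69:5672"]
  ],
  channels: [
    sub_chan: [connection: :sub_conn],
    other_chan: [connection: :other_conn]
  ]
```

Each channel would then be fetched by name, for example with `AMQP.Application.get_channel(:other_chan)`. This only isolates the other channels from a connection-level error; it does not address the channel.open error itself.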

corben2 commented

We've worked around this by wrapping all of our message consumption in a try-catch. Otherwise, even if the consumer came back properly, I think it would just repeatedly crash on the same message.
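A minimal sketch of that kind of wrapper, not the actual code from our project: `SafeConsume` and `process/1` are hypothetical names. It turns a raise, throw, or exit inside the handler into a return value so the consumer process stays alive. The commented usage assumes a GenServer consumer receiving the amqp library's `{:basic_deliver, payload, meta}` messages, and rejects without requeue so the same message is not redelivered over and over.

```elixir
defmodule SafeConsume do
  @moduledoc """
  Hypothetical helper (not part of the amqp library): runs a message handler
  and converts any raise, throw or exit into a return value, so the calling
  consumer process does not crash on a bad message.
  """

  @spec run((binary() -> any()), binary()) :: :ack | {:reject, term()}
  def run(handler, payload) when is_function(handler, 1) do
    try do
      handler.(payload)
      :ack
    rescue
      # Raised exceptions, e.g. RuntimeError or MatchError
      exception -> {:reject, exception}
    catch
      # Throws and exits, e.g. a failed GenServer.call inside the handler
      kind, reason -> {:reject, {kind, reason}}
    end
  end
end

# Usage inside a GenServer consumer (needs the amqp library and a broker;
# process/1 is a hypothetical function holding the real message handling):
#
#   def handle_info({:basic_deliver, payload, %{delivery_tag: tag}}, chan) do
#     case SafeConsume.run(&process/1, payload) do
#       :ack -> AMQP.Basic.ack(chan, tag)
#       {:reject, _reason} -> AMQP.Basic.reject(chan, tag, requeue: false)
#     end
#
#     {:noreply, chan}
#   end
```

Whether to reject, requeue, or route failures to a dead-letter queue depends on the application; rejecting without requeue is just one way to avoid crashing repeatedly on the same message.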

The queue is not exclusive.

I believe I originally tried opening separate connections (to the same server, though) for the other channels, and saw the same behavior. I can retry that and get back to you but, as I said above, even if this is fixed I think there might still be other issues, so maybe we can just close this issue.