Azure/azure-event-hubs-node

No connection handler was found for virtual host '522'.

MatejSkrbis opened this issue · 13 comments

After running the receiver for some time, the following temporary error is displayed:
No connection handler was found for virtual host '522'.

Is this something the library should handle?
If not, what are the possible error codes, and is there any documentation about them and how to handle them?

The sdk should be handling this. What is the name and version of the package that you are using?

I'm currently using "azure-event-hubs": "0.2.8"

I hadn't noticed that @azure/event-hubs should be used now.
I'll try it with the latest version.

Yup, I was just about to say that. It would be easier to provide a resolution if you could provide debug logs, which you can enable by setting the environment variable as follows:

export DEBUG=azure*,rhea*,-rhea:raw,-rhea:message,-azure:amqp-common:datatransformer

I was looking at the source code to figure out why you are seeing this error.
I feel it's this line of code.
If an error is received and the sdk user did not close the receiver, then we let the user know about it. However, we will make an attempt to reconnect if the error is retryable.
The particular error that you are seeing is retryable, and the sdk is retrying at an interval of 15 seconds. Usually the service comes back up in under 60 seconds (once they are done moving the nodes/vms in the background). So you should see in the debug logs that the sdk is making reconnection attempts every 15 seconds. Usually the receiver should be back up within 4 or 5 attempts.

Maybe the sdk should not send the error to the error handler while it is retrying, to avoid confusion.

The flip side is that if the receiver is silent for a long time (15 seconds) between reconnect attempts, the user may think that the receiver is hung, whereas it's actually retrying.
Maybe sending the error with a message that the receiver is retrying would be a better option.

I've tried installing "@azure/event-hubs": "1.0.1".
Now it seems that sometimes when I call eventHubClient.getPartitionIds(), it throws:

condition:"com.microsoft:timeout"
message:"The request with message_id "8015a8c3-144c-4b8d-a98d-c44fff43cdb4" to "$management" endpoint timed out. Please try again later."
name:"ServiceUnavailableError"
retryable:true

I'm not sure if it has anything to do with the newer version or if it is just a temporary thing.

I think that is a temporary thing. I just ran my tests again and they work fine. If, for some reason, the sdk does not receive a response from the service within the specified time, the request times out. The request is attempted 3 times at an interval of 15 seconds, and if the operation still fails, the sdk throws an error.
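If you want the same behaviour around your own calls, the retry policy described above can be sketched roughly like this. This is only an illustration, not the sdk's actual code: `withRetry`, `sleep` and `MaybeRetryable` are hypothetical names, and it assumes that errors carry the `retryable` flag shown in the error output earlier in this thread.

```typescript
// Hypothetical helper, not part of @azure/event-hubs.
// Assumption: retryable errors expose a boolean `retryable` property.
interface MaybeRetryable {
  retryable?: boolean;
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Runs `operation` up to `maxAttempts` times, waiting `intervalMs` between
// attempts. Non-retryable errors are rethrown immediately; the last error is
// rethrown once the attempts are exhausted.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts: number = 3,
  intervalMs: number = 15000
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      const retryable = (err as MaybeRetryable | undefined)?.retryable === true;
      if (!retryable || attempt === maxAttempts) {
        break;
      }
      await sleep(intervalMs);
    }
  }
  throw lastError;
}
```

With a client in scope, a call would look something like `await withRetry(() => eventHubClient.getPartitionIds())`. The defaults mirror the 3 attempts and 15 second interval mentioned above.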

On a side note, you may want to try the event processor host, @azure/event-processor-host. It is an efficient message receiver. It does a couple of things:

  • scans partitions at regular intervals to determine if a receiver should be connected to that partition
  • balances the load of receiving messages from different partitions. It also balances load across different instances of EPH as well. The instances could be within the same process or different processes on the same machine or across different machines.

The user needs to checkpoint the messages to the storage account at regular intervals, so that other instances can start their receivers from the checkpointed offset.
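To make the checkpointing idea concrete, here is a minimal sketch of how a checkpointed offset lets another instance pick up where the previous one stopped. It is not the event processor host API: `InMemoryCheckpointStore`, `StoredEvent` and `eventsToProcess` are hypothetical names, the Map stands in for the storage account, and offsets are simplified to numbers.

```typescript
// Hypothetical in-memory stand-in for the storage account used for checkpoints.
interface StoredEvent {
  offset: number; // simplified: assumed to be a monotonically increasing number
  body: string;
}

class InMemoryCheckpointStore {
  private readonly offsets = new Map<string, number>();

  // Records the offset of the last processed event for a partition.
  // Older offsets never overwrite newer ones.
  checkpoint(partitionId: string, offset: number): void {
    const current = this.offsets.get(partitionId);
    if (current === undefined || offset > current) {
      this.offsets.set(partitionId, offset);
    }
  }

  lastCheckpoint(partitionId: string): number | undefined {
    return this.offsets.get(partitionId);
  }
}

// Returns the events a newly started receiver still has to process:
// everything after the checkpoint, or everything if none exists yet.
function eventsToProcess(
  partitionId: string,
  events: StoredEvent[],
  store: InMemoryCheckpointStore
): StoredEvent[] {
  const last = store.lastCheckpoint(partitionId);
  return last === undefined ? events : events.filter((e) => e.offset > last);
}
```

The less often you checkpoint, the more events a new instance will process a second time after taking over a partition.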

Hello @MatejSkrbis - How is it going? Just checking to make sure things are running smoothly.

I've tried the newer version of the library, "@azure/event-hubs": "1.0.1", but the 'No connection handler was found for virtual host' error still showed up in the logs yesterday.
I will also try @azure/event-processor-host, but it may not be suitable for everything I need. I may occasionally need to read from all partitions at once.

Well, having the error in the logs is fine. But did you see the sdk trying to reconnect after receiving the error?

I am not 100% sure, but I think it did reconnect. There were generally no obvious problems with receiving messages. So may I just ignore this message in future when it occurs, so that it doesn't spam the logs? Are there any other messages I can ignore, and what are the error codes?

If I kept everything non-vital in the exception logs, the logs wouldn't be as useful anymore for detecting other, more serious problems.
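One possible way to keep the exception logs clean in the meantime is to filter on the `retryable` flag in the error handler. A rough sketch, assuming the errors passed to the handler carry the same fields as the error output earlier in this thread; `ReceiverError`, `isWorthLogging` and `makeOnError` are hypothetical names, not part of the sdk:

```typescript
// Assumed shape, based on the fields in the error output quoted above.
interface ReceiverError {
  name?: string;
  message?: string;
  condition?: string;
  retryable?: boolean;
}

// Only errors that are not flagged as retryable go to the exception log.
function isWorthLogging(err: ReceiverError): boolean {
  return err.retryable !== true;
}

// Builds an error callback that routes retryable errors to a debug log
// and everything else to the exception log.
function makeOnError(
  logError: (err: ReceiverError) => void,
  logDebug: (msg: string) => void
): (err: ReceiverError) => void {
  return (err: ReceiverError): void => {
    if (isWorthLogging(err)) {
      logError(err);
    } else {
      const detail = err.message ?? err.name ?? "unknown";
      logDebug(`transient error, sdk is expected to retry: ${detail}`);
    }
  };
}
```

The returned function can be passed wherever the receiver expects an error callback, with your own logging functions plugged in.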

I will be sending a PR tomorrow, to not bubble up retryable errors to the user if the sdk is retrying. This should avoid any confusion.

@MatejSkrbis - New version of @azure/event-hubs: 1.0.3 has been published which fixes this issue.