See lots of 'Error updating Consul health check: "500"' messages in the logs
cap10morgan opened this issue · 8 comments
I'm using the 0.6.1 release w/ Consul 0.7.1 and RabbitMQ 3.6.5. The cluster starts up OK, but then I see lots of messages like this in the RabbitMQ logs:
=ERROR REPORT==== 30-Nov-2016::17:34:10 ===
autocluster: Error updating Consul health check: "500"
I'm not sure how concerned I should be by that, things generally seem to work. But it doesn't fill me with confidence either.
My guess is you're getting a 500 error back from consul. I'd check the consul logs to see what it has to say.
Errors from Consul log look like this:
2016/11/30 17:36:55 [ERR] http: Request GET /v1/agent/check/pass/service%3Arabbitmq, error: CheckID "service:rabbitmq" does not have associated TTL from=172.17.0.2:34919
Sorry for the delay, what version of Consul?
0.7.1
I could reproduce the same bug by stopping the consul node while the containers are running:
2017/03/01 18:05:28 [ERR] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76, error: CheckID "service:rabbitmq-test1:192.168.1.76" does not have associated TTL from=172.20.0.3:49147
2017/03/01 18:05:28 [DEBUG] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76 (247.449µs) from=172.20.0.3:49147
2017/03/01 18:05:31 [ERR] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76, error: CheckID "service:rabbitmq-test1:192.168.1.76" does not have associated TTL from=172.20.0.2:52632
2017/03/01 18:05:31 [DEBUG] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76 (338.155µs) from=172.20.0.2:52632
2017/03/01 18:05:43 [ERR] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76, error: CheckID "service:rabbitmq-test1:192.168.1.76" does not have associated TTL from=172.20.0.3:51063
2017/03/01 18:05:43 [DEBUG] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76 (75.139µs) from=172.20.0.3:51063
2017/03/01 18:05:46 [ERR] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76, error: CheckID "service:rabbitmq-test1:192.168.1.76" does not have associated TTL from=172.20.0.2:36984
2017/03/01 18:05:46 [DEBUG] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76 (225.301µs) from=172.20.0.2:36984
2017/03/01 18:05:58 [ERR] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76, error: CheckID "service:rabbitmq-test1:192.168.1.76" does not have associated TTL from=172.20.0.3:39587
2017/03/01 18:05:58 [DEBUG] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76 (105.067µs) from=172.20.0.3:39587
2017/03/01 18:06:01 [ERR] http: Request GET /v1/agent/check/pass/service%3Arabbitmq-test1%3A192.168.1.76, error: CheckID "service:rabbitmq-test1:192.168.1.76" does not have associated TTL
and RabbitMQ side:
rabbit_node_1_1 | =ERROR REPORT==== 1-Mar-2017::17:06:01 ===
rabbit_node_1_1 | autocluster: Error updating Consul health check: "500"
rabbit_node_2_1 |
rabbit_node_2_1 | =ERROR REPORT==== 1-Mar-2017::17:06:13 ===
rabbit_node_2_1 | autocluster: Error updating Consul health check: "500"
rabbit_node_1_1 |
rabbit_node_1_1 | =ERROR REPORT==== 1-Mar-2017::17:06:16 ===
rabbit_node_1_1 | autocluster: Error updating Consul health check: "500"
rabbit_node_2_1 |
rabbit_node_2_1 | =ERROR REPORT==== 1-Mar-2017::17:06:28 ===
rabbit_node_2_1 | autocluster: Error updating Consul health check: "500"
rabbit_node_1_1 |
rabbit_node_1_1 | =ERROR REPORT==== 1-Mar-2017::17:06:31 ===
rabbit_node_1_1 | autocluster: Error updating Consul health check: "500"
rabbit_node_2_1 |
rabbit_node_2_1 | =ERROR REPORT==== 1-Mar-2017::17:06:43 ===
rabbit_node_2_1 | autocluster: Error updating Consul health check: "500"
rabbit_node_1_1 |
rabbit_node_1_1 | =ERROR REPORT==== 1-Mar-2017::17:06:46 ===
rabbit_node_1_1 | autocluster: Error updating Consul health check: "500"
rabbit_node_2_1 |
rabbit_node_2_1 | =ERROR REPORT==== 1-Mar-2017::17:06:58 ===
rabbit_node_2_1 | autocluster: Error updating Consul health check: "500"
rabbit_node_1_1 |
rabbit_node_1_1 | =ERROR REPORT==== 1-Mar-2017::17:07:01 ===
rabbit_node_1_1 | autocluster: Error updating Consul health check: "500"
could be the same situation @cap10morgan ?
@Gsantomaggio Not in my case, no. Consul is running the entire time.
Addressed in team RabbitMQ's fork: rabbitmq#2.
This plugin was forked by the RabbitMQ team and is now part of RabbitMQ. More information can be found @ https://github.com/rabbitmq/rabbitmq-autocluster