api_host is a single point of failure when using a cluster
maxadamo opened this issue · 4 comments
Description of problem
- Upgraded the module from 4.x to 5.6.
- The module now uses the API, but `api_host` accepts only a single String, so only one API host in the cluster is used. If I reboot that Sensu server or stop the service, Puppet fails on all the nodes.
- I was expecting a mechanism to try another host if the first is failing, in addition to `fqdn_rotate` to balance the load.
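To illustrate the load-balancing half of that request, here is a minimal Python sketch of a per-node rotation in the spirit of `fqdn_rotate`. It is not the Puppet function's actual algorithm; the helper name `rotate_for_node` and the SHA-256 based offset are assumptions made only for this example. Each node gets a stable, node-specific ordering of the same host list, so different nodes would tend to start with different API hosts.

```python
import hashlib


def rotate_for_node(hosts, fqdn):
    """Return `hosts` cyclically rotated by an offset derived from `fqdn`.

    Hypothetical helper: the offset comes from a SHA-256 hash of the FQDN,
    so the same node always gets the same ordering.
    """
    if not hosts:
        return []
    digest = hashlib.sha256(fqdn.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, reduced to a valid index.
    offset = int.from_bytes(digest[:8], "big") % len(hosts)
    return hosts[offset:] + hosts[:offset]
```

A rotated list alone only spreads load; a caller would still need to walk the list and skip hosts that do not respond.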
On one of my instances I have set up a virtual host on HAProxy, but I was wondering whether we could optionally use an Array of hosts.
I think an array of hosts might work, since I believe the Sensu Go API offers simple ways to check that the API host you request is responsive; we already use that for some of the code. I think the HAProxy approach makes the most sense if it fits into your infrastructure, but if that doesn't work we can investigate supporting an array of hosts for `api_host`.
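The idea of checking responsiveness before picking a host could look roughly like the Python sketch below. It is not code from the module. The function names are hypothetical, and the probe assumes the backend answers `GET /health` over plain HTTP on port 8080 with status 200 when it is reachable; adjust the scheme, port, and path to match your deployment.

```python
import http.client
import urllib.error
import urllib.request


def health_probe(host, port=8080, timeout=2.0):
    """Return True if GET /health on `host` answers with HTTP 200.

    Assumption: plain HTTP on port 8080 and a /health endpoint.
    """
    url = f"http://{host}:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


def first_responsive_host(hosts, probe=health_probe):
    """Return the first host for which `probe` succeeds, or None."""
    for host in hosts:
        if probe(host):
            return host
    return None
```

Note that every unreachable host costs up to `timeout` seconds before the next one is tried, which is the kind of delay a load balancer with its own health checks avoids.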
@treydock I'll mention a related issue here (if necessary I can raise another one):
- If the firewall port to the API is closed and `validate_namespace` is enabled, the type `agent_entity_config` (I think it was this one; I can check the history in Puppetboard) is always applied before the iptables module, and Puppet fails. I had to reset the firewall on each VM to get Puppet running.
Then I declared the namespace and set `validate_namespace` to false, and the problem was gone.
Adding the wontfix label because adding a rotate method instead of using a load balancer is not a good architecture decision and it adds extra complexity to the module. Rotating between servers, much like a host does with its DNS servers, is not ideal as you have issues with timeouts and traffic distribution which are easily handled by a load balancer.