sensu/sensu-puppet

api_host is a single point of failure when using a cluster

maxadamo opened this issue · 4 comments

Description of problem

  • I upgraded the module from 4.x to 5.6.
  • The new version uses the API, but api_host accepts only a single String, so only one API host in the cluster is used. If I reboot that Sensu server or stop the service, Puppet fails on all the nodes.
  • I was expecting a mechanism that tries another host if the first one fails, in addition to fqdn_rotate to balance the load.

On one of my instances I have set up a virtual host on HAProxy, but I was wondering whether api_host could optionally accept an Array of hosts.

I think an array of hosts might work, since I believe the Sensu Go API offers simple ways to check that the API host you request is responsive; we already use that in some of the code. I think the HAProxy approach makes the most sense if it fits your infrastructure, but if it doesn't, we can investigate supporting an array of hosts for api_host.
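For illustration only, the "array of hosts" idea could be sketched roughly as follows. This is not the module's code (the module's providers are Ruby); it is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the port 8080 and the /health path are assumed defaults for the Sensu Go backend API that should be checked against your deployment. The host order is rotated per node, similar in spirit to fqdn_rotate, and the first host that answers the probe is used.

```python
import hashlib
import http.client
import urllib.request
from typing import Callable, List, Optional, Sequence


def rotate_by_fqdn(hosts: Sequence[str], fqdn: str) -> List[str]:
    """Rotate the host list by an offset derived from the node's FQDN.

    Each node gets a stable ordering, and different nodes tend to start
    at different hosts, which spreads the load across the cluster.
    """
    if not hosts:
        return []
    offset = int(hashlib.sha256(fqdn.encode()).hexdigest(), 16) % len(hosts)
    return list(hosts[offset:]) + list(hosts[:offset])


def http_probe(host: str, port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if GET http://host:port/health answers with HTTP 200.

    The port and the /health path are assumptions, not taken from the issue.
    """
    url = f"http://{host}:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, HTTP error status, etc.
        return False


def first_responsive(
    hosts: Sequence[str],
    fqdn: str,
    probe: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first host (in rotated order) that passes the probe, or None."""
    for host in rotate_by_fqdn(hosts, fqdn):
        if probe(host):
            return host
    return None
```

Note that every unreachable host in front of a healthy one costs up to one probe timeout per run, which is one of the trade-offs against a load balancer.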

@treydock I'll mention a related issue here (I can open a separate one if necessary):

  • If the firewall port to the API is closed and validate_namespace is enabled, the agent_entity_config type (I think it was this one; I can check the Puppetboard history) is always applied before the iptables module, and Puppet fails. I had to reset the firewall on each VM to get Puppet running again.
    Then I declared the namespace and set validate_namespace to false, and the problem went away.

@maxadamo For the firewall issue, please go ahead and open a separate issue.

Adding the wontfix label because adding a rotate method instead of using a load balancer is not a good architecture decision, and it adds extra complexity to the module. Rotating between servers, much like a host does with its DNS servers, is not ideal: you run into issues with timeouts and traffic distribution, which a load balancer handles easily.