python/psf-salt

Circular dependency between salt-master and consul VMs

zware opened this issue · 3 comments

zware commented

Steps to reproduce:
In a fresh clone of psf-salt, run

vagrant up salt-master

Hang occurs at:

==> salt-master: [INFO    ] Executing command 'consul-template -config /etc/consul-template.d -once' in directory '/root' 

/var/log/consul.log on salt-master looks like:

Sep 25 04:14:11 salt-master consul[6272]: serf: EventMemberJoin: salt-master 192.168.50.2
Sep 25 04:14:11 salt-master consul[6305]: serf: EventMemberJoin: salt-master 192.168.50.2
Sep 25 04:14:11 salt-master consul[6305]: serf: Failed to re-join any previously known node
Sep 25 04:14:11 salt-master consul[6305]: agent: failed to sync remote state: No known Consul servers
Sep 25 04:14:35 salt-master consul[6305]: http: Request /v1/health/service/graphite?dc=vagrant&passing=1&wait=60000ms, error: No known Consul servers
Sep 25 04:14:40 salt-master consul[6305]: agent: failed to sync remote state: No known Consul servers
Sep 25 04:14:40 salt-master consul[6305]: http: Request /v1/health/service/graphite?dc=vagrant&passing=1&wait=60000ms, error: No known Consul servers
Sep 25 04:15:00 salt-master consul[6305]: message repeated 4 times: [ http: Request /v1/health/service/graphite?dc=vagrant&passing=1&wait=60000ms, error: No known Consul servers]
...
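
The repeated `http: Request` lines above suggest that something on salt-master (presumably the consul-template run) is blocked on a health query for the graphite service. As a rough aid for spotting this in a longer log, here is a minimal sketch that counts failed health queries per service. The function name `blocked_services` and the regular expression are illustrative assumptions, not part of psf-salt, and syslog "message repeated N times" lines are counted once rather than N times.

```python
import re
from collections import Counter

# Matches failed health queries as they appear in /var/log/consul.log, e.g.
#   http: Request /v1/health/service/graphite?dc=vagrant&..., error: No known Consul servers
# The pattern is an assumption based only on the lines quoted in this issue.
HEALTH_ERROR = re.compile(r"http: Request /v1/health/service/([^?\s,]+)\S*, error: ")


def blocked_services(log_lines):
    """Count log lines reporting a failed health query, keyed by service name."""
    counts = Counter()
    for line in log_lines:
        match = HEALTH_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Lines copied from the log excerpt in this issue.
SAMPLE = [
    "Sep 25 04:14:11 salt-master consul[6305]: agent: failed to sync remote state: No known Consul servers",
    "Sep 25 04:14:35 salt-master consul[6305]: http: Request /v1/health/service/graphite?dc=vagrant&passing=1&wait=60000ms, error: No known Consul servers",
    "Sep 25 04:15:00 salt-master consul[6305]: message repeated 4 times: [ http: Request /v1/health/service/graphite?dc=vagrant&passing=1&wait=60000ms, error: No known Consul servers]",
]

if __name__ == "__main__":
    for service, count in blocked_services(SAMPLE).items():
        print(service, count)
```

On a real VM the input could be the open log file itself, for example `blocked_services(open("/var/log/consul.log"))`.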

After pressing Ctrl+C twice to escape the hung 'vagrant up' command, vagrant up consul completes, but the salt state 'consul-template' fails:

==> consul:           ID: consul-template
==> consul:     Function: cmd.wait
==> consul:         Name: consul-template -config /etc/consul-template.d -once
==> consul:       Result: False
==> consul:      Comment: Command "consul-template -config /etc/consul-template.d -once" run
==> consul:      Started: 04:22:14.255200
==> consul:     Duration: 18.716 ms
==> consul:      Changes:   
==> consul:               ----------
==> consul:               pid:
==> consul:                   4294
==> consul:               retcode:
==> consul:                   15
==> consul:               stderr:
==> consul:                   2015/09/25 04:22:14 [ERR] (runner) error running command: exit status 1
==> consul:                   Consul Template returned errors:
==> consul:                   1 error(s) occurred:
==> consul:                   
==> consul:                   * exit status 1
==> consul:               stdout:
==> consul: 
==> consul: Summary
==> consul: -------------
==> consul: Succeeded: 46 (changed=35)
==> consul: Failed:     1
==> consul: -------------
==> consul: Total states run:     47

Bringing up another VM (such as 'speed-web') after vagrant up consul behaves like bringing up 'consul' itself: vagrant up completes, but 'consul-template' fails. Bringing up speed-web without first attempting consul results in the same hang that salt-master experiences.

zware commented

Update:
I think this comment may be a red herring: my ubuntu/trusty64 box was very slightly out of date, and running apt-get upgrade on salt-master resolved this by updating python-requests.

Original comment:
Also, I just found this repeated a few times in salt-master's /var/log/salt/master:

2015-09-25 04:47:55,562 [salt.pillar      ][ERROR   ] Failed to load ext_pillar consul: request() got an unexpected keyword argument 'json'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/salt/pillar/__init__.py", line 523, in ext_pillar
    key)
  File "/usr/local/lib/python2.7/dist-packages/salt/pillar/__init__.py", line 484, in _external_pillar_data
    ext = self.ext_pillars[key](self.opts['id'], pillar, **val)
  File "/srv/salt/_extensions/pillar/consul.py", line 124, in ext_pillar
    CONSUL_ACL,
  File "/srv/salt/_extensions/modules/consul.py", line 115, in create_acl
    params={"token": token},
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 99, in put
    return request('put', url, data=data, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
TypeError: request() got an unexpected keyword argument 'json'

Since both salt and consul are involved in that traceback, I assume it's related, but I have no idea how or whose bug it would be. Unfortunately I'm also not sure at which point the tracebacks were generated (after the initial up salt-master, I tried several variations on vagrant reload <vm>, where <vm> was salt-master or consul, with and without the '--provision' flag to vagrant reload).
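
For context on the TypeError in the traceback: the `json` keyword argument was added to the Requests API in release 2.4.2, so older installs reject it exactly as shown. A minimal sketch of a call pattern that works on both old and new Requests follows. The helper name `put_kwargs` and the example payload are hypothetical, and this is not how `_extensions/modules/consul.py` is actually written.

```python
import json


def put_kwargs(payload, requests_version):
    """Build keyword arguments for requests.put() that suit the installed Requests.

    `requests_version` is a version string such as requests.__version__.
    """
    version = tuple(int(part) for part in requests_version.split(".")[:3])
    if version >= (2, 4, 2):
        # Newer Requests serialises the payload and sets the header itself.
        return {"json": payload}
    # Older Requests: serialise by hand and declare the content type.
    return {
        "data": json.dumps(payload),
        "headers": {"Content-Type": "application/json"},
    }


# Hypothetical usage, assuming `url`, `token`, and `acl` are defined by the caller:
#   import requests
#   requests.put(url, params={"token": token},
#                **put_kwargs(acl, requests.__version__))
```

Upgrading python-requests, as described in the update above, avoids the need for any such shim.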

zware commented

Workaround to get speed-web running:

vagrant up salt-master # wait for hang, ^C^C
vagrant up consul
vagrant reload --provision salt-master
vagrant ssh salt-master -c "sudo service salt-master restart && sudo salt-call state.highstate"
vagrant up speed-web

Step 4 may not be necessary with an up-to-date ubuntu/trusty64 box; I also had to run apt-get upgrade on salt-master (I ran it on speed-web as well, which may or may not have been necessary).
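
For anyone repeating this often, the non-interactive part of the workaround could be scripted roughly as follows. This is only a sketch: the function names are made up, the first step (vagrant up salt-master, then ^C^C at the hang) is assumed to have been done by hand already, and `restart_master=False` skips step 4 for boxes that are already up to date.

```python
import shlex
import subprocess


def workaround_commands(restart_master=True):
    """Return the workaround steps after the manual 'vagrant up salt-master' + ^C^C."""
    steps = [
        ["vagrant", "up", "consul"],
        ["vagrant", "reload", "--provision", "salt-master"],
    ]
    if restart_master:
        # Step 4 of the workaround; possibly unnecessary on an up-to-date box.
        steps.append([
            "vagrant", "ssh", "salt-master", "-c",
            "sudo service salt-master restart && sudo salt-call state.highstate",
        ])
    steps.append(["vagrant", "up", "speed-web"])
    return steps


def run(steps, dry_run=True):
    """Print each command; execute it too unless dry_run is set."""
    for argv in steps:
        print(" ".join(shlex.quote(arg) for arg in argv))
        if not dry_run:
            # Stops at the first failing command.
            subprocess.check_call(argv)


if __name__ == "__main__":
    run(workaround_commands(), dry_run=True)
```

The dry run only prints the commands, so the order can be checked before anything touches the VMs.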

This is resolved with recent work to update our local dev to use Docker! #245