thefactory/cloudformation-mesos

Isolated Instances


I'm struggling to get a system up. When I spin up a Mesos cluster, the masters and slaves are isolated from each other, so I suspect an issue with ZooKeeper.

I've set up 3 functioning ZooKeeper instances using factory/cloudformation-zk-exhibitor.
Slaves aren't registering with the master and neither are other masters.
Docker containers are dying immediately.
If someone would be able to give me a hand I would greatly appreciate it!

Here's the output of docker logs for the marathon container:

MESOS_NATIVE_JAVA_LIBRARY is not set. Searching in /usr/lib /usr/local/lib.
MESOS_NATIVE_LIBRARY, MESOS_NATIVE_JAVA_LIBRARY set to '/usr/local/lib/libmesos.so'
[2014-11-03 21:08:23,282] INFO Starting Marathon 0.7.1 (mesosphere.marathon.Main$:20)
[scallop] Error: Validation failure for 'zk' option parameters: zk:///mesos_marathon

And here's the output of docker logs for the marathon logger container:

Configuring in-memory store with {'max_length': 100}
HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /v2/eventSubscriptions?callbackUrl=http%3A%2F%2Flocalhost%3A5000%2Fevents (Caused by <class 'socket.error'>: [Errno 111] Connection refused)
Traceback (most recent call last):
  File "/opt/marathon-logger/marathon-logger.py", line 53, in <module>
    m.create_event_subscription(args.callback_url)
  File "/usr/local/lib/python2.7/dist-packages/marathon/client.py", line 278, in create_event_subscription
    return response.json()
AttributeError: 'NoneType' object has no attribute 'json'

Looks like the Mesos servers are failing to talk to Exhibitor. Can you log into a Mesos master or slave and try curling the Exhibitor URL you passed to the stack?
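
The `zk:///mesos_marathon` in the Marathon log has an empty host list, which is consistent with the Exhibitor lookup returning nothing. If curl isn't handy, here is a rough Python sketch of the same check. It assumes Exhibitor's `/exhibitor/v1/cluster/list` endpoint and its `servers` / `port` JSON fields. The function names and the example URL are made up for illustration and are not part of the stack.

```python
import json
from urllib.request import urlopen


def zk_url_from_cluster_list(payload, path):
    """Build a zk:// URL from an Exhibitor cluster-list payload.

    Assumes the payload looks like {"servers": [...], "port": 2181}.
    Raises ValueError when no servers are listed, which is the case
    that would produce a URL like zk:///mesos_marathon.
    """
    servers = payload.get("servers") or []
    port = payload.get("port", 2181)
    if not servers:
        raise ValueError("Exhibitor returned no ZooKeeper servers")
    hosts = ",".join("{}:{}".format(server, port) for server in servers)
    return "zk://{}/{}".format(hosts, path.lstrip("/"))


def fetch_cluster_list(exhibitor_url, timeout=5):
    """Fetch the cluster list from Exhibitor.

    exhibitor_url is a placeholder such as
    "http://exhibitor.example.com:8181"; use the URL passed to the stack.
    """
    endpoint = exhibitor_url.rstrip("/") + "/exhibitor/v1/cluster/list"
    with urlopen(endpoint, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Run from a Mesos master or slave, `zk_url_from_cluster_list(fetch_cluster_list(url), "mesos_marathon")` should either give a URL with real hosts in it or fail loudly. A connection error or the ValueError would point at networking or security groups between the Mesos instances and Exhibitor.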