Question about Consul service registration with regard to ports and Docker
I currently have an app running in a Docker container on Mesos, scheduled with Marathon, along with the mesos-consul bridge.
The current Marathon app configuration uses bridge networking and lets Mesos/Marathon pick any available host port, while the port inside the container is fixed at 8080:
{
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "sarlindo/wildfly-app",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 8080, "hostPort": 0, "servicePort": 0, "protocol": "tcp" }
      ]
    }
  },
  "id": "wildfly",
  "cmd": "/opt/jboss/wildfly/bin/standalone.sh -b 0.0.0.0 -bmanagement=0.0.0.0",
  "instances": 1,
  "cpus": 0.3,
  "mem": 256
}
When the mesos-consul bridge registers this service with Consul, I see it registered at the following IP/port:
172.17.0.4:31657
The IP here is the internal Docker IP rather than the host IP, and the port is the host port that Mesos/Marathon assigned.
The problem is that I can't reach the service at this address, because inside the Docker container the port is actually 8080.
Is this the way it is supposed to work, or am I doing something wrong here?
Are you using the default mesos-ip-order? Is docker included in the list? I haven't seen a use for having docker in the search order, since it returns the Docker IP address, which isn't particularly useful as far as I can tell. If it is in the search list, try removing it.
The port is probably correct. Mesos will assign a random port to the docker container and map from 31657->8080.
Yes, the port is correct; it's just that the IP chosen and registered with Consul was the Docker IP address. I am running mesos-consul with defaults. The following is the Marathon JSON I am using to run the mesos-consul bridge:
{
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "ciscocloud/mesos-consul",
      "network": "BRIDGE",
      "parameters": [
        { "key": "rm", "value": "true" }
      ]
    }
  },
  "id": "mesos-consul",
  "args": ["--zk=zk://192.168.33.10:2181/mesos"],
  "instances": 1,
  "cpus": 0.1,
  "mem": 256,
  "constraints": [["hostname", "CLUSTER", "node1"]]
}
Hmm... I can't reproduce this. Can you post the task section from /master/state.json on the Mesos leader?
Here you go, see below. I think I may be running into d2iq-archive/mesos-dns#334. (I know it says mesos-dns, but if you follow the thread, I believe someone points to Mesos itself as the potential issue. I will have to dig some more.)
{
  "executor_id": "",
  "framework_id": "13742ebd-7985-4898-b01e-6587d19b885d-0001",
  "id": "wildfly.88156cb6-925c-11e5-b212-02429beb943f",
  "name": "wildfly",
  "resources": {
    "cpus": 0.3,
    "disk": 0,
    "mem": 256.0,
    "ports": "[31268-31268]"
  },
  "slave_id": "a8f46f83-034d-459b-ac0e-e2effd094e4f-S1",
  "state": "TASK_RUNNING",
  "statuses": [
    {
      "container_status": {
        "network_infos": [
          {
            "ip_address": "172.17.0.3"
          }
        ]
      },
      "labels": [
        {
          "key": "Docker.NetworkSettings.IPAddress",
          "value": "172.17.0.3"
        }
      ],
      "state": "TASK_RUNNING",
      "timestamp": 1448336183.15899
    }
  ]
},
That is exactly what you're running into. Ugh. The default search order is netinfo,mesos,host, so it's using the IP address in the network_infos block. A workaround is to add "--mesos-ip-order=mesos,host" to your Marathon job for mesos-consul.
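To illustrate why changing the search order changes the registered address, here is a minimal Python sketch of a "first source that has a value wins" lookup. It is not mesos-consul's actual code: the function names `candidate_ips` and `pick_ip` are hypothetical, the host IP `192.0.2.10` is a placeholder, and the sketch assumes the `mesos` source has no value for this Docker task because the posted status only carries `network_infos` and the Docker label.

```python
# Hypothetical sketch of an IP search order; not mesos-consul's implementation.

def candidate_ips(task_status, host_ip):
    """Collect one candidate IP per source name from a task status entry."""
    candidates = {"host": host_ip}
    infos = task_status.get("container_status", {}).get("network_infos", [])
    if infos and infos[0].get("ip_address"):
        candidates["netinfo"] = infos[0]["ip_address"]
    for label in task_status.get("labels", []):
        if label["key"] == "Docker.NetworkSettings.IPAddress":
            candidates["docker"] = label["value"]
    # Assumption: nothing in the posted status feeds the "mesos" source,
    # so it is left out and the lookup falls through to the next source.
    return candidates


def pick_ip(order, candidates):
    """Return the IP of the first source in the comma-separated order that has one."""
    for source in order.split(","):
        ip = candidates.get(source.strip())
        if ip:
            return ip
    return None


# Task status taken from the state.json excerpt in this thread.
status = {
    "container_status": {"network_infos": [{"ip_address": "172.17.0.3"}]},
    "labels": [
        {"key": "Docker.NetworkSettings.IPAddress", "value": "172.17.0.3"}
    ],
    "state": "TASK_RUNNING",
}

# 192.0.2.10 is a placeholder for the agent's host IP, not a value from the thread.
candidates = candidate_ips(status, host_ip="192.0.2.10")

default_choice = pick_ip("netinfo,mesos,host", candidates)
workaround_choice = pick_ip("mesos,host", candidates)
print(default_choice)
print(workaround_choice)
```

With the default order the lookup stops at `netinfo` and yields the Docker bridge address; with `mesos,host` it skips that source and ends up at the host address, which is the one the mapped host port is reachable on.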
@ChrisAubuchon I have actually been trying this, but now the mesos-consul bridge doesn't seem to register any new services with Consul at all. I created a new service in Marathon, and nothing shows up for it in the Consul UI.
This is my new Marathon JSON for mesos-consul:
{
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "ciscocloud/mesos-consul",
      "network": "BRIDGE",
      "parameters": [
        { "key": "rm", "value": "true" }
      ]
    }
  },
  "id": "mesos-consul",
  "args": ["--zk=zk://192.168.33.10:2181/mesos --mesos-ip-order=mesos,host"],
  "instances": 1,
  "cpus": 0.1,
  "mem": 256,
  "constraints": [["hostname", "CLUSTER", "node1"]]
}
These are the logs that I see for the mesos-consul bridge:
vagrant@node1:~/projects/consul$ sudo docker logs 5cd4ec4464d7
2015/11/24 16:26:45 Connected to 192.168.33.10:2181
2015/11/24 16:26:45 Authenticated: id=94921046598942733, timeout=40000
Any clue as to why adding this new flag would cause issues?
The command line arguments in the args list need to be separate elements:
"args": [
  "--zk=zk://192.168.33.10:2181/mesos",
  "--mesos-ip-order=mesos,host"
],
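The difference can be sketched in Python. This assumes each element of `args` reaches the program as exactly one argv entry, and that the program uses a typical flag parser that splits each entry at the first `=`. The `parse_flags` helper is hypothetical and only stands in for such a parser; it is not how mesos-consul itself is written.

```python
# Hypothetical stand-in for a typical "--name=value" flag parser.

def parse_flags(argv):
    """Split each argv element at the first '=' into a flag name and its value."""
    flags = {}
    for arg in argv:
        name, _, value = arg.partition("=")
        flags[name.lstrip("-")] = value
    return flags


# One list element: the second flag is swallowed into the value of --zk.
joined = parse_flags(
    ["--zk=zk://192.168.33.10:2181/mesos --mesos-ip-order=mesos,host"]
)

# Two list elements: each flag is seen on its own.
separate = parse_flags(
    ["--zk=zk://192.168.33.10:2181/mesos", "--mesos-ip-order=mesos,host"]
)

print(joined)
print(separate)
```

In the joined case there is no `mesos-ip-order` flag at all, and the ZooKeeper URL carries the extra text as part of its path. That would be consistent with the logs above, where the bridge connects to ZooKeeper but then registers nothing.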
Oops! It's working now. Thanks, Chris.
Out of curiosity, do you work for Cisco? What does Cisco have to do with these projects?
Mesos-consul was developed as part of Cisco's Mantl project.
Thanks.