mesos/kafka

Setting LIBPROCESS_IP

Closed this issue · 7 comments

When running this framework on marathon (as suggested in the readme in the docker folder of the repo) the framework doesn't receive offers in my setup. I believe this is because when there are multiple network interfaces, LIBPROCESS_IP must be set so that libmesos knows which IP to advertise.

When I constrain the Docker image to run on a particular slave and set LIBPROCESS_IP to that slave's IP, the framework receives offers and works correctly.

How can I make this work without constraining which slave is used? Is there a LIBPROCESS_HOST variable? I ask because the Docker container is launched with the slave's hostname defined in an environment variable, but not its IP address.

You can set bind-address for the scheduler (https://github.com/mesos/kafka#scheduler-configuration), and you can also export the environment variable (https://github.com/mesos/kafka#environment-configuration). When launching the scheduler on Marathon, that can be done via its JSON without any problem ("env": "blah" inside the "app" structure).

Sure, I know that... But the tricky part is knowing the IP address of the slave that the container is launched on. I ended up finding this issue open against Mesos, https://issues.apache.org/jira/browse/MESOS-3740, which has apparently been addressed in DC/OS but not in open source Mesos.

In the meantime, the solution I ended up going with is mounting the Mesos slave config file, where the slave IP is defined, inside the container and reading the IP from it.
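For readers who want to try the same approach, here is a minimal sketch of reading the IP from a mounted file before starting the scheduler. The thread does not name the file, so the path /etc/mesos-slave/ip, the SLAVE_IP_FILE variable, and the read_slave_ip helper are all assumptions made for illustration. It also assumes the file contains just the IP address and has been bind-mounted into the container.

```shell
#!/bin/sh
# Hypothetical helper (not part of mesos/kafka): print the first
# whitespace-separated token of the first non-empty line in the given file.
read_slave_ip() {
    awk 'NF { print $1; exit }' "$1"
}

# Assumed location of the mounted slave config file; override as needed.
SLAVE_IP_FILE="${SLAVE_IP_FILE:-/etc/mesos-slave/ip}"

# Only export LIBPROCESS_IP when the file is actually readable.
if [ -r "$SLAVE_IP_FILE" ]; then
    LIBPROCESS_IP="$(read_slave_ip "$SLAVE_IP_FILE")"
    export LIBPROCESS_IP
fi
```

A wrapper like this would run as the container's entry point, before the scheduler's java command, so that libmesos sees LIBPROCESS_IP in its environment.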

deric commented

I wrote a workaround for this issue, see #240.

@deric I tried a similar approach, putting the shell commands directly in the Marathon descriptor, but wasn't able to make it work consistently across all our environments. I ended up reading a file from the slave that defines the IP.
Also, I found this issue open on the Mesos JIRA: https://issues.apache.org/jira/plugins/servlet/mobile#issue/MESOS-3740/comment/15332711

deric commented

@eli-jordan What kind of environments do you have? getent should work anywhere with glibc, and even musl in Alpine Linux should be compatible (although I haven't tested this). Are you sure you were exporting the correct IP address? I've tested this on Mesos 0.28.2.

@deric I forget exactly what problem I had, but it wasn't that getent was unavailable; it was that it didn't return the desired IP address.

In shell executor mode, LIBPROCESS_IP is available (0.28.2), so it's probably better to run the scheduler without Docker (which also results in less download traffic ;-)). Here is what the Marathon config looks like, which is working for me:

{
  "id": "/kafka/scheduler",
  "cmd": "LIBPROCESS_PORT=${PORT} java -jar kafka-mesos-0.9.5.1.jar scheduler --master=zk://${ZOOKEEPER}/mesos/mesos-1 --storage=zk:/scheduler --zk=${ZOOKEEPER}/kafka/mesos/1 --api http://${LIBPROCESS_IP}:${PORT1} --user root",
  "cpus": 1,
  "mem": 128,
  "disk": 0,
  "instances": 1,
  "env": {
    "MESOS_NATIVE_JAVA_LIBRARY": "/usr/local/lib/libmesos.so",
    "ZOOKEEPER": "zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181"
  },
  "portDefinitions": [
    {
      "port": 10003,
      "protocol": "tcp",
      "name": "libprocess",
      "labels": {}
    },
    {
      "port": 10004,
      "protocol": "tcp",
      "name": "api",
      "labels": {}
    }
  ],
  "uris": [
    "https://github.com/mesos/kafka/releases/download/v0.9.5.1/kafka-mesos-0.9.5.1.jar",
    "https://archive.apache.org/dist/kafka/0.10.0.1/kafka_2.11-0.10.0.1.tgz"
  ],
  "fetch": [
    {
      "uri": "https://github.com/mesos/kafka/releases/download/v0.9.5.1/kafka-mesos-0.9.5.1.jar",
      "extract": false,
      "executable": false,
      "cache": true
    },
    {
      "uri": "https://archive.apache.org/dist/kafka/0.10.0.1/kafka_2.11-0.10.0.1.tgz",
      "extract": false,
      "executable": false,
      "cache": true
    }
  ]
}