mesos/kafka

Brokers don't start

deric opened this issue · 9 comments

deric commented

I've followed the examples, but kafka-mesos isn't launching any tasks on Mesos:

./kafka-mesos.sh broker list
brokers:
  id: 0
  active: true
  state: starting
  resources: cpus:1.00, mem:2048, heap:1024, port:auto
  failover: delay:1m, max-delay:10m
  stickiness: period:10m

  id: 1
  active: true
  state: starting
  resources: cpus:1.00, mem:2048, heap:1024, port:auto
  failover: delay:1m, max-delay:10m
  stickiness: period:10m

  id: 2
  active: true
  state: starting
  resources: cpus:1.00, mem:2048, heap:1024, port:auto
  failover: delay:1m, max-delay:10m
  stickiness: period:10m

The scheduler returns the following: badMessage: 400 for HttpChannelOverHttp@21e29795{r=0,a=IDLE,uri=-}WrappedArray()

The only strange message in logs:

2016-07-15 11:31:10,753 [Jetty-63] WARN  org.eclipse.jetty.http.HttpParser  - badMessage: 400 for HttpChannelOverHttp@21e29795{r=0,a=IDLE,uri=-}WrappedArray()
2016-07-15 11:32:36,533 [Jetty-64] INFO  ly.stealth.mesos.kafka.HttpServer$  - handling - http://svc01:7000/api/broker/add
2016-07-15 11:32:36,570 [Jetty-64] INFO  ly.stealth.mesos.kafka.HttpServer$  - finished handling
2016-07-15 11:33:24,340 [Jetty-64] INFO  ly.stealth.mesos.kafka.HttpServer$  - handling - http://svc01:7000/api/broker/list
2016-07-15 11:33:24,344 [Jetty-64] INFO  ly.stealth.mesos.kafka.HttpServer$  - finished handling
2016-07-15 11:33:33,832 [Jetty-66] WARN  org.eclipse.jetty.http.HttpParser  - badMessage: 400 for HttpChannelOverHttp@6d440fc{r=0,a=IDLE,uri=-}WrappedArray()
2016-07-15 11:33:55,235 [Jetty-66] WARN  org.eclipse.jetty.http.HttpParser  - badMessage: 400 for HttpChannelOverHttp@3e36cb{r=0,a=IDLE,uri=-}WrappedArray()

I'm using kafka-mesos-0.9.5.1.jar.

deric commented

The problem was that I was running mesos-kafka in a Docker container in BRIDGE mode. The framework then tries to bind to a random port on the Docker private network. If anyone is interested, here's my Marathon config:

{
  "id": "/service/kafka",
  "cmd": "./kafka-mesos.sh scheduler --master=zk://192.168.1.1:2181/mesos --zk=192.168.1.1:2181 --api=http://svc01:7000 --storage=zk:/kafka-mesos --debug=true",
  "cpus": 0.5,
  "mem": 256,
  "disk": 0,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "kafka-mesos:1.0.0",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 7000,
          "hostPort": 0,
          "servicePort": 10079,
          "protocol": "tcp",
          "labels": {}
        },
        {
          "containerPort": 7001,
          "hostPort": 0,
          "servicePort": 10080,
          "protocol": "tcp",
          "labels": {}
        }
      ],
      "privileged": false
    }
  },
  "env": {
    "LIBPROCESS_PORT": "7001"
  },
  "labels": {
    "HAPROXY_0_PORT": "7000"
  }
}

Here LIBPROCESS_PORT (7001) is used for communication with the Mesos master, and HAPROXY_0_PORT (7000) is the public HTTP API.

Then we need to patch kafka-mesos.sh:

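# HOST and PORT1 are set by Marathon for the running task.
if [ -n "${HOST}" ]; then
  # Advertise the agent's IP and the host port mapped to libprocess (7001).
  export LIBPROCESS_ADVERTISE_IP=$(getent hosts "${HOST}" | awk '{ print $1 }')
  export LIBPROCESS_ADVERTISE_PORT=${PORT1}
fi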

Marathon assigns random host ports (hostPort is 0) and maps them to the container ports, for example:

0.0.0.0:36279->7000/tcp, 0.0.0.0:36280->7001/tcp

This way mesos-kafka will advertise the Mesos agent's IP and port 36280 (in this case). Inside the container, however, the API will bind to 7000 and libprocess will bind to 7001.
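
A slightly more defensive variant of the patch is sketched below. It is an illustration, not part of kafka-mesos: the function names first_address and set_advertise_address are made up here, and it assumes, as the patch does, that Marathon provides HOST and PORT1 and that getent is available in the image. It exports nothing when the host does not resolve or the port is empty, so libprocess never ends up with only half of the advertised address set.

```shell
#!/usr/bin/env bash

# Print the first field of the first non-empty line on stdin.
# For `getent hosts` output this is the first IP address.
first_address() {
  awk 'NF { print $1; exit }'
}

# Hypothetical helper: export the libprocess advertise variables only
# when both the IP and the port are known. Returns 1 otherwise.
set_advertise_address() {
  local host="$1" port="$2" ip
  ip=$(getent hosts "$host" 2>/dev/null | first_address)
  if [ -z "$ip" ] || [ -z "$port" ]; then
    echo "cannot determine advertise address (host='$host', port='$port')" >&2
    return 1
  fi
  export LIBPROCESS_ADVERTISE_IP="$ip"
  export LIBPROCESS_ADVERTISE_PORT="$port"
}
```

In kafka-mesos.sh this could be called as set_advertise_address "${HOST}" "${PORT1}" before the scheduler is started, with the script deciding whether a failure should abort the start.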

Something related to this issue: mesosphere/marathon#4216

CBR09 commented

Hi @deric, I've tried that and the kafka scheduler has registered with Mesos, but when I start a broker, it can't download the jar from the kafka scheduler because the API port is wrong (it is still 7000).

Downloading resource from 'http://x.x.x.x:7000/jar/kafka-mesos-0.10.0.0-rc1-kafka_2.10-0.10.1.1.jar' to '/var/tmp/mesos/slaves/95d866e0-9f6d-4447-b62f-5c6657cb1589-S1/frameworks/95d866e0-9f6d-4447-b62f-5c6657cb1589-0018/executors/kafka-0-7e858a8d-1424-4b01-acdd-7bc505c90806/runs/5aee8349-8d64-40a1-a6eb-06938bd9341d/kafka-mesos-0.10.0.0-rc1-kafka_2.10-0.10.1.1.jar'
Failed to fetch 'http://x.x.x.x:7000/jar/kafka-mesos-0.10.0.0-rc1-kafka_2.10-0.10.1.1.jar': Error downloading resource: Couldn't connect to server
Failed to synchronize with agent (it's probably exited)

port mapping
0.0.0.0:31960->7000/tcp, 0.0.0.0:31961->7001/tcp

How can I advertise the API port the same way as the libprocess port?
Thanks

deric commented

@CBR09 We use marathon-lb (haproxy) to access apps running in the Mesos cluster, but any service discovery mechanism (like mesos-dns) could be used.

CBR09 commented

Thanks @deric. So if I use service discovery (mesos-dns or marathon-lb), will my broker find the kafka scheduler? Can you please explain how it works?

deric commented

@CBR09 Yeah, that should do it. Just make sure you pass the flag --api=http://x.x.x.x:7000 to the scheduler. kafka-mesos then acts as a Mesos framework and you can interact with it over the API.

CBR09 commented

@deric I'm still confused. With mesos-dns, I can resolve the domain name from the command line to the IP of the Mesos slave on which the kafka scheduler is running, along with the exposed port (31960 in this case), but I don't know how to use this in the Marathon JSON. I don't know the exposed port unless I define it in the hostPort field.
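
For reference, the command-line lookup described above can be sketched roughly as follows. This is only an illustration under assumptions: it relies on dig being installed and on the resolver pointing at Mesos-DNS, and the helper names srv_to_endpoints and lookup_endpoints are made up for this example. It only discovers the host ports after the task is running; it does not make the scheduler advertise them.

```shell
#!/usr/bin/env bash

# Convert `dig +short SRV` answer lines ("priority weight port target.")
# into "target:port" pairs, dropping the trailing dot of the target.
srv_to_endpoints() {
  awk 'NF == 4 { sub(/\.$/, "", $4); print $4 ":" $3 }'
}

# Hypothetical helper: query an SRV record name and print its endpoints.
# Requires dig and a resolver that can answer for the Mesos-DNS domain.
lookup_endpoints() {
  dig +short SRV "$1" | srv_to_endpoints
}
```

A call would look like lookup_endpoints "_kafka-service._tcp.marathon.mesos", where the record name is a guess at how Mesos-DNS would name this app and should be checked against the actual cluster. The output has one line per SRV answer, so with the two port mappings above both host ports would be expected to appear.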

So, if I understand what you're asking, there is currently no way to do it. There's no support in the scheduler to advertise one port (for resource downloading for example) while listening on another.

Support for NAT (or PAT in your case) would be nice to have, and I'd love to see a pull request adding it. :)

CBR09 commented

Yes, that's what I want to ask, thank you for your confirmation. If I can, I will