d2iq-archive/marathon-lb

marathon-lb stops syncing with Marathon and I do not know why


I am running the scrapinghub/splash image, used for web scraping, which is unstable. I use Marathon health checks to keep all eleven containers alive. I even had to set Backoff Factor, Backoff and Max Launch Delay to 1 second because of the instability.
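To make the effect of those three settings concrete, here is a minimal sketch of how the relaunch delay grows after consecutive failures. It assumes the behaviour described in the Marathon docs (the delay starts at the backoff value, is multiplied by the factor after each failure, and is capped at the max launch delay); `launch_delays` is a hypothetical helper, not part of Marathon.

```python
def launch_delays(backoff_seconds, backoff_factor, max_launch_delay_seconds, failures):
    """Approximate delay (in seconds) before each relaunch after consecutive failures.

    Assumption: delay_n = min(backoff_seconds * backoff_factor ** n, max_launch_delay_seconds)
    """
    delays = []
    delay = float(backoff_seconds)
    for _ in range(failures):
        delays.append(min(delay, float(max_launch_delay_seconds)))
        delay *= backoff_factor
    return delays


# Settings from this issue: everything pinned to 1, so the delay never grows.
print(launch_delays(1, 1, 1, 5))

# For comparison, a hypothetical factor of 2 with a 60 second cap.
print(launch_delays(1, 2, 60, 8))
```

With everything set to 1, a crashing task is relaunched roughly every second, so Marathon emits a constant stream of task events that marathon-lb has to process.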

After one day of running, I had to restart the marathon-lb container because it had somehow lost its hooks to Marathon.

I attached a session to the marathon-lb container and restarted all the containers in Marathon. marathon-lb did not reload its config, and not a single message was printed.

Then I restarted the marathon-lb container and everything was fine.
Looking at the logs, it aborted mid-update and then got stuck.
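One way to look for this kind of stall in a saved log is to find the longest silence between timestamped marathon_lb lines. This is a rough sketch that assumes the `YYYY-MM-DD HH:MM:SS,mmm marathon_lb:` prefix seen in the log in this issue; `largest_gap` is a hypothetical helper, not something marathon-lb provides.

```python
import re
from datetime import datetime

# Matches the timestamp prefix of marathon_lb log lines as they appear in this issue.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) marathon_lb:")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest silence, or None."""
    prev_ts = prev_line = None
    best = None
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue  # skip HAProxy and shell output, which use other formats
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds()
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_ts, prev_line = ts, line
    return best


sample = [
    "2018-05-08 09:40:33,711 marathon_lb: backend server 10.4.0.116:31249 on cbrsrvbld13",
    "2018-05-08 09:40:33,712 marathon_lb: backend serve^C",
    "[WARNING] 127/094039 (59) : Exiting Master process...",
    "2018-05-08 10:59:02,625 marathon_lb: setting default value for HAPROXY_HEAD",
]
print(largest_gap(sample))
```

A long gap only shows where marathon_lb went quiet. It does not say why, so it is a starting point for reading the surrounding lines rather than a diagnosis.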

The liveness check still responded and the haproxy?stats page was still available, but it was reporting a set of containers that had already been killed.
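Since the liveness check does not catch this state, a separate staleness check could compare what HAProxy is serving with what Marathon reports. The sketch below is only an illustration under a few assumptions: the HAProxy stats are available as CSV text, server names end in `_<ip with underscores>_<port>` (as in `cbrsrvbld13_10_4_0_116_31159` from the log), and the Marathon data is the JSON returned by `/v2/apps?embed=apps.tasks`. All function names here are hypothetical, and fetching the two inputs is left out.

```python
import csv
import io
import re
import socket

# Assumed server-name suffix, inferred from the log: ..._10_4_0_116_31159
SVNAME_RE = re.compile(r"_(\d+)_(\d+)_(\d+)_(\d+)_(\d+)$")


def haproxy_servers(stats_csv):
    """Extract 'ip:port' endpoints from HAProxy stats CSV text."""
    reader = csv.DictReader(io.StringIO(stats_csv.lstrip("# ")))
    endpoints = set()
    for row in reader:
        match = SVNAME_RE.search(row.get("svname") or "")
        if match:  # FRONTEND/BACKEND summary rows do not match and are skipped
            endpoints.add("{}.{}.{}.{}:{}".format(*match.groups()))
    return endpoints


def marathon_endpoints(apps_json, resolve=socket.gethostbyname):
    """Extract 'ip:port' endpoints from a Marathon /v2/apps?embed=apps.tasks response."""
    endpoints = set()
    for app in apps_json.get("apps", []):
        for task in app.get("tasks", []):
            ip = resolve(task["host"])  # task hosts may be names, HAProxy uses IPs
            for port in task.get("ports", []):
                endpoints.add("{}:{}".format(ip, port))
    return endpoints


def stale_backends(stats_csv, apps_json, resolve=socket.gethostbyname):
    """Endpoints HAProxy still serves that Marathon no longer lists."""
    return haproxy_servers(stats_csv) - marathon_endpoints(apps_json, resolve)


stats = (
    "# pxname,svname,scur,\n"
    "splash_splash_80,cbrsrvbld13_10_4_0_116_31159,0,\n"
    "splash_splash_80,cbrsrvbld13_10_4_0_116_31249,0,\n"
    "splash_splash_80,BACKEND,0,\n"
)
apps = {"apps": [{"id": "/splash/splash",
                  "tasks": [{"host": "cbrsrvbld13", "ports": [31249, 31250]}]}]}
print(stale_backends(stats, apps, resolve=lambda host: "10.4.0.116"))
```

A non-empty result that persists for more than a reload interval would suggest marathon-lb has stopped applying updates, which matches the situation described here.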

You can see below that it lists 3 of the 11 servers and is stuck mid-line on the fourth until it receives the docker restart, which I presume is responsible for the ^C in the logs. After that it resumes as if nothing had happened.

(SORRY FOR THE LOG DUMP)

2018-05-08 09:40:33,708 marathon_lb: GET http://10.4.0.116:8080/v2/apps?embed=apps.tasks
2018-05-08 09:40:33,710 marathon_lb: got apps ['/splash/splash']
2018-05-08 09:40:33,710 marathon_lb: generating config
2018-05-08 09:40:33,711 marathon_lb: HAProxy dir is /marathon-lb
2018-05-08 09:40:33,711 marathon_lb: configuring app /splash/splash
2018-05-08 09:40:33,711 marathon_lb: frontend at *:80 with backend splash_splash_80
2018-05-08 09:40:33,711 marathon_lb: adding virtual host for app with hostname splash.pt.internal
2018-05-08 09:40:33,711 marathon_lb: adding virtual host for app with id /splash/splash
2018-05-08 09:40:33,711 marathon_lb: backend server 10.4.0.116:31033 on cbrsrvbld13
2018-05-08 09:40:33,711 marathon_lb: backend server 10.4.0.116:31159 on cbrsrvbld13
2018-05-08 09:40:33,711 marathon_lb: backend server 10.4.0.116:31249 on cbrsrvbld13
2018-05-08 09:40:33,712 marathon_lb: backend serve^C
[WARNING] 127/094039 (59) : Former worker 13321 exited with code 0
[WARNING] 127/094039 (59) : Former worker 13341 exited with code 0
[WARNING] 127/094039 (59) : Exiting Master process...
[/marathon-lb /marathon-lb/run] 9090 > /marathon-lb/service/haproxy/env/PORTS
Generating RSA private key, 2048 bit long modulus
........................................................................................+++
............................+++
e is 65537 (0x010001)
Signature ok
subject=CN = *
Getting Private key
[/marathon-lb /marathon-lb/run] Created /marathon-lb/service/lb/run with contents:
[/marathon-lb /marathon-lb/run] #!/bin/sh
exec 2>&1
sv status /marathon-lb/service/haproxy || exit 1
cd /marathon-lb
exec /marathon-lb/marathon_lb.py     --syslog-socket /dev/null     --haproxy-config /marathon-lb/haproxy.cfg     --ssl-certs "/etc/ssl/cert.pem"     --command "/marathon-lb/reload_haproxy.sh"     --sse --group external --marathon http://10.4.0.116:8080 --skip-validation --haproxy-map
run: /marathon-lb/service/haproxy: (pid 29) 0s
[WARNING] 127/105902 (29) : parsing [/marathon-lb/haproxy.cfg:84] : 'timeout client' will be ignored because backend 'splash_splash_80' has no frontend capability
[WARNING] 127/105902 (29) : In backend 'splash_splash_80' (id: 8): server name mismatch: from server state file: 'cbrsrvbld13_10_4_0_116_31159', from running config 'cbrsrvbld13_10_4_0_116_31249'
[WARNING] 127/105902 (29) : In backend 'splash_splash_80' (id: '8'): server ID mismatch: from server state file: '3', from running config 2
[WARNING] 127/105902 (29) : In backend 'splash_splash_80' (id: '8'): server ID mismatch: from server state file: '4', from running config 3
[WARNING] 127/105902 (29) : In backend 'splash_splash_80' (id: '8'): server ID mismatch: from server state file: '5', from running config 4
[WARNING] 127/105902 (29) : In backend 'splash_splash_80' (id: '8'): server ID mismatch: from server state file: '6', from running config 5
[WARNING] 127/105902 (29) : In backend 'splash_splash_80' (id: '8'): server ID mismatch: from server state file: '7', from running config 6
[WARNING] 127/105902 (29) : In backend 'splash_splash_80' (id: '8'): server ID mismatch: from server state file: '8', from running config 7
[WARNING] 127/105902 (29) : In backend 'splash_splash_80' (id: '8'): server ID mismatch: from server state file: '9', from running config 8
[WARNING] 127/105902 (29) : In backend 'splash_splash_80' (id: '8'): server ID mismatch: from server state file: '10', from running config 9
[WARNING] 127/105902 (29) : In backend 'splash_splash_80' (id: '8'): server ID mismatch: from server state file: '11', from running config 10
[WARNING] 127/105902 (29) : In backend 'splash_splash_10000' (id: 9): server name mismatch: from server state file: 'cbrsrvbld13_10_4_0_116_31160', from running config 'cbrsrvbld13_10_4_0_116_31250'
[WARNING] 127/105902 (29) : In backend 'splash_splash_10000' (id: '9'): server ID mismatch: from server state file: '3', from running config 2
[WARNING] 127/105902 (29) : In backend 'splash_splash_10000' (id: '9'): server ID mismatch: from server state file: '4', from running config 3
[WARNING] 127/105902 (29) : In backend 'splash_splash_10000' (id: '9'): server ID mismatch: from server state file: '5', from running config 4
[WARNING] 127/105902 (29) : In backend 'splash_splash_10000' (id: '9'): server ID mismatch: from server state file: '6', from running config 5
[WARNING] 127/105902 (29) : In backend 'splash_splash_10000' (id: '9'): server ID mismatch: from server state file: '7', from running config 6
[WARNING] 127/105902 (29) : In backend 'splash_splash_10000' (id: '9'): server ID mismatch: from server state file: '8', from running config 7
[WARNING] 127/105902 (29) : In backend 'splash_splash_10000' (id: '9'): server ID mismatch: from server state file: '9', from running config 8
[WARNING] 127/105902 (29) : In backend 'splash_splash_10000' (id: '9'): server ID mismatch: from server state file: '10', from running config 9
[WARNING] 127/105902 (29) : In backend 'splash_splash_10000' (id: '9'): server ID mismatch: from server state file: '11', from running config 10
[WARNING] 127/105902 (29) : Failed to connect to the old process socket '/var/run/haproxy/socket'
[ALERT] 127/105902 (29) : Failed to get the sockets from the old process!
[ALERT] 127/105902 (29) : sendmsg logger #1 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #2 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #1 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #2 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #1 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #2 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #1 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #2 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #1 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #2 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #1 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #2 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #1 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #2 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #1 failed: No such file or directory (errno=2)
[ALERT] 127/105902 (29) : sendmsg logger #2 failed: No such file or directory (errno=2)
2018-05-08 10:59:02,625 marathon_lb: setting default value for HAPROXY_HEAD
2018-05-08 10:59:02,625 marathon_lb: setting default value for HAPROXY_USERLIST_HEAD
2018-05-08 10:59:02,625 marathon_lb: setting default value for HAPROXY_HTTP_FRONTEND_HEAD