arriven/db1000n

Main db1000n network is unreachable when ovpn container is restarted

palianycia123 opened this issue · 3 comments

I noticed that the ovpn service in the docker-compose app is sometimes restarted, which is fine because there may be a connectivity issue with the VPN server. The autoheal container works perfectly here: it restarts the ovpn container. But it doesn't restart the main db1000n container, because that container has no health check endpoint.
The result is a false-positive db1000n status: the container is up and running, but no packets are being transmitted.

Expected Behavior

It would be good to add a health check endpoint to the main db1000n program that does something like nslookup google.com and returns 200 on success and non-200 on failure. We could then be confident in the network setup by calling the health check endpoint periodically via docker-compose.

Actual Behavior

Network is unreachable in main db1000n container when ovpn container is restarted.

Steps to Reproduce the Problem

  1. Run docker-compose -f examples/docker/static-docker-compose.yml up -d (to see the network is unreachable error in the main db1000n container, add the LOG_LEVEL:DEBUG environment variable in static-docker-compose.yml).
  2. Turn the host network off to simulate a network connectivity issue.
  3. Observe the ovpn container logs (docker logs docker_ovpn_1 -f) and wait until the container is restarted:
2022-05-02 08:45:43 SIGTERM[hard,] received, process exiting
Exiting.
  4. Turn the host network back on.
  5. Observe that the ovpn container logs show it has connected successfully:
2022-05-02 13:45:07 Initialization Sequence Completed
  6. Observe the main program logs:
error sending packet	{"error": "dial tcp 146.120.90.38:80: connect: network is unreachable", "args": {"address":"146.120.90.38:80","body":"{{ random_payload 10 }}","connection":{"args":{"address":"146.120.90.38:80","protocol":"tcp","proxy_urls":"","timeout":null},"type":"net"},"interval_ms":1,"packet":{"payload":{"data":{"payload":"{{ random_payload 10 }}"},"type":"raw"}}}}
error sending packet	{"error": "dial tcp 146.120.90.247:443: connect: network is unreachable", "args": {"address":"146.120.90.247:443","body":"{{ random_payload 10 }}","connection":{"args":{"address":"146.120.90.247:443","protocol":"tcp","proxy_urls":"","timeout":null},"type":"net"},"interval_ms":1,"packet":{"payload":{"data":{"payload":"{{ random_payload 10 }}"},"type":"raw"}}}}
error sending packet	{"error": "dial tcp 146.120.90.42:8080: connect: network is unreachable", "args": {"address":"146.120.90.42:8080","body":"{{ random_payload 10 }}","connection":{"args":{"address":"146.120.90.42:8080","protocol":"tcp","proxy_urls":"","timeout":null},"type":"net"},

Note: Restarting the main db1000n container manually resolves the network issue.

Specifications

  • Version: v0.8.33
  • Platform: macos and ubuntu
  • Subsystem: docker-compose

I'm considering different ways to do it, but it feels like this and #525 could be implemented in the same way. Or rather, implementing that one would make it very easy to implement this one.

Well, simply having the app crash or exit on network issues might not resolve this issue. According to the autoheal docs, a health check is a dependency for the autoheal container:
Note: You must apply HEALTHCHECK to your docker images first

I'm not completely sure, but I assume autoheal would just restart the container if the healthcheck fails. Crashing the main process would make the container restart anyway if a correct restart policy is provided.