Restart/Reconnect containers connected via 'network_mode: service' automatically when main service is restarted

Question

DavHau opened this issue 6 years ago · 31 comments

Is your feature request related to a problem? Please describe.
When running the following docker-compose.yml:

version: "3.7"

services:
  
  mother:
    image: alpine
    command: "sleep 999999"
    restart: always

  child:
    image: alpine
    command: "sleep 888888"
    network_mode: "service:mother"

If the mother container is restarted for any reason (crash / manual restart), the child container loses its network forever.

$ docker-compose restart mother
$ docker-compose exec child ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever

The child container is fully disconnected from the world. It will not reattach to mother's network. It will be unable to communicate with other containers and the internet. Does it make any sense at all to continue running the child container in this state?

Describe the solution you'd like
Whenever a service is restarted which has other services connected to it via 'network_mode: service', then reconnect those other services or restart them if reconnecting is technically unfeasible.

Describe alternatives you've considered
A workaround using a healthcheck and autoheal is described here: #6329 (comment)

In discussions of other issues related to 'network_mode: service' it is suggested to use a user-defined network instead. But as far as I know, there are container compositions which require 'network_mode: service', for example when putting multiple containers behind a VPN. Please correct me if I'm wrong.
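For readers who have not followed the link, the healthcheck-plus-autoheal workaround can be sketched roughly as below. This is an illustration under assumptions, not the exact configuration from the linked comment: it assumes the community willfarrell/autoheal image and its default behaviour of restarting unhealthy containers labelled autoheal=true, and it uses 1.1.1.1 as an arbitrary external address to probe.

```yaml
services:
  mother:
    image: alpine
    command: "sleep 999999"
    restart: always

  child:
    image: alpine
    command: "sleep 888888"
    network_mode: "service:mother"
    restart: always
    labels:
      autoheal: "true"   # assumption: the label autoheal watches by default
    healthcheck:
      # Probe an address outside the container, so that losing the shared
      # network namespace makes the check fail (1.1.1.1 is a placeholder).
      test: ["CMD", "ping", "-c", "1", "-W", "2", "1.1.1.1"]
      interval: 30s
      timeout: 5s
      retries: 3

  autoheal:
    image: willfarrell/autoheal   # assumption: community image, not part of Compose
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

The child is restarted as a whole once its healthcheck has failed the configured number of times, so recovery is delayed by roughly interval times retries.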

gionag commented a year ago

+1

Answer 1 · 2019-10-09T19:55:46.000Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Answer 2 · 2019-10-10T07:32:46.000Z

AFAIK this is still an issue.

Answer 3 · 2019-10-10T07:32:48.000Z

This issue has been automatically marked as not stale anymore due to the recent activity.

Answer 4 · 2019-10-10T12:52:48.000Z

When you use "service" network_mode (i.e. sharing network namespace between containers), losing connectivity on restart is really the expected behaviour. Comparable to using "host" network and getting the node shut down and service restarted elsewhere on cluster.

Such usage only makes sense for highly coupled containers (typically: containers in a Kubernetes Pod), not for services that need to communicate with each other reliably. Automatically restarting the dependent service would help you hide the networking constraints of your architecture, but this is just cheating. Better to have your architecture embrace the risk of a dependent service being restarted or scaled up/down. For this purpose, use your compose file to define an explicit network connecting services together.

Answer 5 · 2020-04-07T13:31:38.000Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Answer 6 · 2020-04-08T03:39:38.000Z

@ndeloof

when you use "service" network_mode (i.e. sharing network namespace between containers), losing connectivity on restart is really the expected behaviour.

Intuitively, I would not call this expected behaviour. Let me give some real-world examples. In my home network, if my connection depends on a cable being plugged into my machine and I unplug that cable and plug it back in, I expect my machine to reconnect. Or if my connection depends on some other machine, e.g. my router, and I restart that machine, I expect my network to be back up once it has restarted.

Comparable to using "host" network and getting the node shut down and service restarted elsewhere on cluster.

I agree with this comparison, as it demonstrates how useless this kind of behaviour is. This is why you would never configure your cluster to run a service without a vital resource being present. I therefore think it would be a good idea to stop doing that in docker compose as well. When using "service" network_mode, the services are highly coupled, so that one cannot live without the other. In a mother-child configuration the child is strongly dependent on the mother, and it never makes sense to have the child running without a mother. There is no good reason not to also stop the child if the mother is gone or cannot reunite with the child after being restarted.

For this purpose, use your compose file to define an explicit network connecting services together.

As I already stated in the original issue, there are some container configurations where creating an explicit network is not sufficient and you instead have to share the network adapter itself. For example, forcing any kind of container to connect via a VPN container. Therefore your suggestion doesn't solve the problem.

Answer 7 · 2020-04-08T03:39:41.000Z

This issue has been automatically marked as not stale anymore due to the recent activity.

Answer 8 · 2021-01-01T20:04:04.000Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Answer 9 · 2021-01-09T14:56:07.000Z

This issue has been automatically closed because it had not recent activity during the stale period.

Answer 10 · 2021-04-19T17:27:03.000Z

Amazing this issue still exists

Answer 11 · 2021-08-04T11:39:20.000Z

I agree with @DavHau. There should be at least an option to make this behaviour possible.

Answer 12 · 2021-09-25T17:21:24.000Z

Now that compose is transitioning to v2, maybe it would be worth checking this issue again? @ndeloof

There are quite a lot of use cases where automatically restarting the child's network stack is useful, as explained above. As of now, child containers are deprived of all network connectivity once the container providing the network stack dies.

The healthcheck workaround provided in the first comment is a rather brutal and completely ineffective approach: child containers do not need to be restarted outright, since the only thing failing is their network stack, not the service itself or whatever the container provides. Once the mother (or network-providing container) is restarted, the network stack of the child containers should be restarted/updated as well, thus avoiding brutal and (hopefully not) taxing and long reloads of important services.

To make things worse I've seen quite a lot of images using healthchecks that only do probes internally. If some container offers some service at localhost:8000, chances are it's just using plain curl -f localhost:8000, which won't fail even if the container providing the network stack fails. This wouldn't be too much of an issue if they did something like curl -f localhost:8000 && curl -f google.com, but I for one don't support the idea of restarting completely fine-working containers just because their network stack malfunctioned for a brief moment.

Answer 13 · 2021-09-27T05:59:28.000Z

If child depends on the mother service, as defined here by network_mode (but it could also be by any other shared namespace, or by an explicit depends_on), it would make sense to me that restarting the mother service restarts all the dependent services. (Pull requests are welcome on v2 :P)

That being said, to connect services together you had better define a network to be shared between services. The only scenario I can imagine that requires a shared network namespace is one of the services accessing the other as localhost without the ability for you to change this behavior.

Answer 14 · 2023-02-01T19:06:17.000Z

The only scenario I can imagine that requires a shared network namespace is one of the services accessing the other as localhost without the ability for you to change this behavior.

opening this issue again as the described situation is the one I am in

Answer 15 · 2023-02-14T06:54:36.000Z

For another use case for a feature like this, check out the gluetun project (https://github.com/qdm12/gluetun).

It's a VPN container that routes all the network traffic in the namespace through a VPN tunnel. So, any containers connected via the network_mode:container-name have their network traffic routed through the tunnel. This is great for applications which do not support proxy routing at the application level.

A feature that allows the child network to be reconnected automatically when the parent is restarted would be fantastic for containers that depend on gluetun for a secure connection to somewhere else.

Answer 16 · 2023-02-19T11:30:02.000Z

I want to push this.

In my situation I use a VPN container and connect several other containers to it via network_mode: "container:mycontainer".
Sometimes I have to restart the VPN, to change the server or just for maintenance. After that, I have to manually restart all the child containers. I know that I could put everything in the same compose file, but then I lose flexibility.

A good behavior would be an option like:
restart: on_network
The child container would then restart if it loses its network connection. As a next step, this check should be done at configurable intervals, to prevent countless container restarts.

Kind Regards

Answer 17 · 2023-04-30T07:16:10.000Z

Could we keep this issue open?

Answer 18 · 2023-07-25T18:20:15.000Z

Can someone reopen this issue?

Answer 19 · 2023-09-11T12:41:01.000Z

network_mode implies an explicit depends_on between services, and as such the "mother" service does already restart the depending services:

$ cat compose.yaml 
services:
  mother:
    image: nginx
  app:
    image: nginx
    network_mode: "service:mother"

$ docker compose up -d
[+] Building 0.0s (0/0)                                    docker:desktop-linux
[+] Running 3/3
 ✔ Network chose_default     Created                                       0.0s 
 ✔ Container chose-mother-1  Started                                       0.0s 
 ✔ Container chose-app-1     Started                                       0.0s 
$ docker compose restart mother
[+] Restarting 2/2
 ✔ Container chose-mother-1  Started                                       0.3s 
 ✔ Container chose-app-1     Started                                       0.0s

if this is not what you get, please open a new issue with details on your configuration

Answer 20 · 2023-09-11T15:21:05.000Z

Just tested: in my setup, restarting the mother doesn't trigger a restart of the child...

Answer 21 · 2023-09-11T15:29:57.000Z

@gionag did you try my example? Which version of compose are you running?

Answer 22 · 2023-09-11T15:56:52.000Z

As far as I understand the reasoning of the others they might mean that in case of a container crash (restart: always) or something similar (like manual docker container restart) the children aren't restarted. A restart only happens with the explicit compose restart command.

Answer 23 · 2023-09-11T16:31:10.000Z

network_mode implies an explicit depends_on between services, and as such the "mother" service does already restart the depending services:

$ cat compose.yaml 
services:
  mother:
    image: nginx
  app:
    image: nginx
    network_mode: "service:mother"

$ docker compose up -d
[+] Building 0.0s (0/0)                                    docker:desktop-linux
[+] Running 3/3
 ✔ Network chose_default     Created                                       0.0s 
 ✔ Container chose-mother-1  Started                                       0.0s 
 ✔ Container chose-app-1     Started                                       0.0s 
$ docker compose restart mother
[+] Restarting 2/2
 ✔ Container chose-mother-1  Started                                       0.3s 
 ✔ Container chose-app-1     Started                                       0.0s

if this is not what you get, please open a new issue with details on your configuration

I tested it. If I need to, say, update the mother container with a newer image, add an environment variable, or recreate the container (with the same name), I have to attach the mother network again (I'm using Portainer).

Answer 24 · 2023-09-11T19:54:09.000Z

Obviously this only applies when Compose recreates the mother container. Any other scenario, where the user re-creates the container or the container restarts after a crash, isn't managed by Compose.

Answer 25 · 2023-09-12T00:03:11.000Z

Obviously this only applies when Compose recreates the mother container. Any other scenario, where the user re-creates the container or the container restarts after a crash, isn't managed by Compose.

Hence the bug. If we think this doesn't belong in compose, then a bug in the docker runtime should probably track it?

@ndeloof since you're part of the docker organization, where do you think this should be tracked? At the end of the day I think we all want to see this bug/feature fixed/implemented.

I think part of the disconnect here is on what the purpose of docker-compose is. If I'm interpreting your comment correctly, compose is only intended to reconcile what's actually running in the docker runtime when it's directly invoked.

Others potentially expect the conditions & restrictions that are specified in a compose file to be used to continuously reconcile the state the container runtime is in. E.g. if something causes the docker runtime to put a container out of its intended state, then compose kicks in and reconciles the changes.

Maybe what we're asking here for is runtime continuous dependencies between different containers, vs at-the-time-of-command dependencies between the container definitions.
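Until something like that exists, a runtime dependency of this kind can be approximated outside Compose by listening to the engine's event stream. The following is only a sketch under assumptions: the function names and the container names in the usage note are made up for illustration, and it relies on restarting a child being enough to rejoin the namespace of a parent that was restarted in place. A parent that was recreated with a new ID would likely require the children to be recreated as well.

```shell
#!/bin/sh
# Sketch: restart dependents whenever a parent container (re)starts.
# All container names passed in are placeholders chosen by the caller.

# Restart each named container in turn; keep going if one of them fails.
restart_dependents() {
    for child in "$@"; do
        docker restart "$child" || echo "failed to restart $child" >&2
    done
}

# Block on the Docker event stream and react to every "start" event of the
# parent. First argument: parent container; remaining arguments: dependents.
watch_parent() {
    parent=$1
    shift
    docker events --filter "container=$parent" --filter "event=start" \
        --format '{{.Actor.Attributes.name}}' |
    while read -r _name; do
        restart_dependents "$@"
    done
}
```

After sourcing the file, something like `watch_parent vpn app1 app2` would run until interrupted (the three names are hypothetical). The watcher itself has to be kept alive, for instance as a small sidecar container with access to the Docker socket.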

Answer 26 · 2023-09-12T04:53:25.000Z

If we think this doesn't belong in compose, then a bug in the docker runtime should probably track it?

Definitely not within Compose's scope as long as the events don't take place under its control.
I also don't think this should be reported to the docker runtime: when you replace a resource, invalidating those which depend on it, it is your responsibility to manage the reconciliation. This is what compose offers when you use up to recreate a container.
Is there any reason you want to do this on your own?

Answer 27 · 2024-04-03T20:34:42.000Z

Would love this as well. I have a VPN container and all dependents on it lose network if this container is restarted/crashed etc.

Answer 28 · 2024-04-03T21:02:15.000Z

@Fossil01 the engine is not aware of the relations between services declared in compose, so it can't manage such a "cascade" restart.

Answer 29 · 2024-04-03T21:42:46.000Z

It probably should be aware of such things.

Answer 30 · 2024-04-04T06:05:29.000Z

@melyux this should be discussed on github.com/moby/moby
My 2 cents: the engine already manages the "on failure" restart policy; maybe it could also manage a shared-namespace source being restarted.