Support `FDSTORE=1` (sd_pid_notify_with_fds) to restart a container without losing active TCP connections
eriksjolund opened this issue · 0 comments
I would like to have a container use
sd_pid_notify_with_fds(0, 0, "FDSTORE=1\nFDNAME=foobar", &fd, 1);
(see man sd_notify) to store an active TCP socket. It would then be possible to restart a container that has an active TCP connection without the container losing the connection. (Containers generally don't use `sd_pid_notify_with_fds()`, so it would only work for containers that support it.)
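For illustration, here is roughly what such a call puts on the wire. This is a minimal sketch, not libsystemd's implementation: it assumes the notify protocol described in man sd_notify (a datagram sent to the AF_UNIX socket named in `$NOTIFY_SOCKET`, with the file descriptor attached as `SCM_RIGHTS` ancillary data). The helper name `notify_with_fd` is hypothetical.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: send `state` (for example "FDSTORE=1\nFDNAME=foobar")
 * together with one file descriptor to the socket named in $NOTIFY_SOCKET.
 * Returns 0 on success, -1 on error. */
int notify_with_fd(const char *state, int fd)
{
    const char *path = getenv("NOTIFY_SOCKET");
    if (path == NULL || (path[0] != '/' && path[0] != '@') || path[1] == '\0')
        return -1;

    size_t len = strlen(path);
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    if (len >= sizeof(addr.sun_path))
        return -1;
    memcpy(addr.sun_path, path, len);
    if (path[0] == '@')
        addr.sun_path[0] = '\0';        /* '@' marks an abstract socket */

    int sock = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
    if (sock < 0)
        return -1;

    struct iovec iov = { .iov_base = (void *)state, .iov_len = strlen(state) };

    /* Control buffer sized and aligned for exactly one descriptor. */
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = &addr;
    /* Path sockets include the trailing NUL, abstract ones do not. */
    msg.msg_namelen = (socklen_t)(offsetof(struct sockaddr_un, sun_path)
                                  + len + (path[0] == '/' ? 1 : 0));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;       /* pass the descriptor itself */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    ssize_t n = sendmsg(sock, &msg, MSG_NOSIGNAL);
    close(sock);
    return n < 0 ? -1 : 0;
}
```

A container process would call `notify_with_fd("FDSTORE=1\nFDNAME=foobar", fd)`. Whether both the message and the descriptor reach systemd depends on the container runtime forwarding them, which is the feature requested here.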
Scenario: `/usr/bin/testserver` in the container image IMG:1 supports socket activation. A client on the internet connects. The testserver is started and calls `accept()` on the inherited socket. A new container image IMG:2 is released. The sysadmin wants to upgrade to IMG:2 without having to disconnect the active TCP connection.
The same scenario described in more detail:
sudo useradd test1
sudo machinectl shell test1@
podman image tag IMG:1 tmptag:latest
podman create --rm --name test --network none tmptag:latest /usr/bin/testserver
mkdir -p ~/.config/systemd/user
podman generate systemd --name --new test > ~/.config/systemd/user/test.service
- create the file ~/.config/systemd/user/test.socket with the file contents
[Unit]
Description=test server

[Socket]
ListenStream=0.0.0.0:3000

[Install]
WantedBy=default.target
systemctl --user start test.socket
- a client on the internet connects to TCP port 3000.
- `/usr/bin/testserver` is started with the inherited socket (from systemd socket activation)
- the testserver calls `accept()` on the socket
- A new container image IMG:2 is available.
podman image tag IMG:2 tmptag:latest
- the sysadmin somehow informs the testserver that it needs to send the active TCP socket file descriptor to systemd with `FDSTORE=1`
- testserver sends the active TCP socket file descriptor to systemd and gives it the name foobar
sd_pid_notify_with_fds(0, 0, "FDSTORE=1\nFDNAME=foobar", &fd, 1);
- testserver terminates with an unclean exit code (see man systemd.service) so that systemd will try to restart the service. (Alternatively the sysadmin could run `systemctl --user restart test.service`.)
- `/usr/bin/testserver` is started again, but this time it also inherits the file descriptor that was previously stored with `FDSTORE=1`.
- testserver calls `sd_listen_fds_with_names()` to identify the stored file descriptor by its name foobar.
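The last step can be sketched without libsystemd. The code below is only an approximation of what `sd_listen_fds_with_names()` does, assuming the environment variables documented for it: `LISTEN_PID`, `LISTEN_FDS` and the colon-separated `LISTEN_FDNAMES`, with passed descriptors starting at fd 3. The helper names `find_passed_fd` and `find_passed_fd_from_env` are hypothetical.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define LISTEN_FDS_START 3   /* same value as SD_LISTEN_FDS_START */

/* Hypothetical helper: given the values of $LISTEN_FDS and $LISTEN_FDNAMES,
 * return the descriptor number whose name equals `wanted`, or -1 if no
 * passed descriptor has that name. */
int find_passed_fd(const char *listen_fds, const char *listen_fdnames,
                   const char *wanted)
{
    if (listen_fds == NULL || listen_fdnames == NULL || wanted == NULL)
        return -1;

    int count = atoi(listen_fds);
    size_t wanted_len = strlen(wanted);
    const char *p = listen_fdnames;

    for (int i = 0; i < count; i++) {
        size_t n = strcspn(p, ":");          /* length of the i-th name */
        if (n == wanted_len && strncmp(p, wanted, n) == 0)
            return LISTEN_FDS_START + i;     /* names map to fds 3, 4, ... */
        if (p[n] == '\0')
            break;                           /* ran out of names */
        p += n + 1;                          /* skip past the ':' */
    }
    return -1;
}

/* Same lookup, reading the environment. The descriptors are only meant
 * for this process if $LISTEN_PID matches our own PID. */
int find_passed_fd_from_env(const char *wanted)
{
    const char *pid = getenv("LISTEN_PID");
    if (pid == NULL || (pid_t)atol(pid) != getpid())
        return -1;
    return find_passed_fd(getenv("LISTEN_FDS"), getenv("LISTEN_FDNAMES"),
                          wanted);
}
```

After the restart, testserver could call `find_passed_fd_from_env("foobar")` and, if the result is not -1, continue serving the client on that descriptor. For the stored descriptor to be kept at all, test.service would presumably also need `FileDescriptorStoreMax=` set to at least 1, since systemd's default is 0.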
In the above example it was assumed that `/usr/bin/testserver` is stateless (except for the active TCP connection). In a more realistic scenario `/usr/bin/testserver` would also need to save its internal application state before restarting. The state could be saved to a file in the normal file system or, even better, to a memory file descriptor created with `memfd_create(2)`.
Copy-paste from man sd_notify:
Application state can either be serialized to a file in /run/, or better, stored in a memfd_create(2) memory file descriptor.
In other words, it would be good if conmon also supported sending a memfd with `FDSTORE=1`.
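A minimal sketch of the memfd idea, assuming Linux with glibc's `memfd_create()` wrapper. The helper names `save_state_to_memfd` and `load_state_from_memfd` and the memfd name "appstate" are hypothetical, and the state is treated as an opaque byte buffer.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy `len` bytes of application state into a new
 * memory file descriptor. Returns the descriptor, or -1 on error. */
int save_state_to_memfd(const void *state, size_t len)
{
    int fd = memfd_create("appstate", MFD_CLOEXEC);
    if (fd < 0)
        return -1;

    const char *p = state;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }
    return fd;
}

/* Hypothetical helper: after the restart, rewind the inherited memfd and
 * read up to `cap` bytes back. Returns the number of bytes read, or -1. */
ssize_t load_state_from_memfd(int fd, void *buf, size_t cap)
{
    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;

    size_t got = 0;
    while (got < cap) {
        ssize_t n = read(fd, (char *)buf + got, cap - got);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                      /* end of stored state */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

The descriptor returned by `save_state_to_memfd()` could then be sent with `FDSTORE=1` under its own `FDNAME`, next to the TCP socket, provided the runtime passes it through.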
Maybe even more file descriptor types could be allowed to be sent? An idea: a new Podman command-line option could control which types of file descriptors are allowed to be sent.
Previous discussion:
Extra note: I don't have any direct need for this feature right now. I just think it could be a useful feature.