rpki-client/rpki-client-container

rpki-client-container 8.5 issues

BenCastricum opened this issue · 5 comments

With the arrival of haproxy in the container, we ran into several issues. When haproxy fails, multirun also kills rpki-client, which causes the container to stop (or restart, depending on the Docker configuration).

We encountered the following cases while trying to upgrade to 8.5:
haproxy does not start when no metrics file exists
haproxy does not start when the metrics file exists and is larger than 16384 bytes (normally the metrics file is about 60000, so definitely too large)
haproxy does not start when there is no IPv6 configured on the system

The first two issues were fixed by having rpki-client.sh run once completely (manually); this changes tune.bufsize in haproxy.cfg to allow the larger metrics file.
I ended up making my own haproxy.cfg file and mounting the config directory into the container at /etc/haproxy. The customised haproxy.cfg has a higher (100k) tune.bufsize and does not require an IPv6 interface. Without these modifications I could not get the container to run properly.
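
For anyone who wants to try the same workaround, here is a minimal sketch of such a config, written out by a small shell function. The frontend and backend names, the port and the metrics path are taken from the haproxy log output in this thread; the timeouts, the content type and the function name `write_haproxy_cfg` are assumptions for illustration, not the config shipped in the container. The buffer size has to be larger than the actual metrics file, so adjust it as needed.

```shell
#!/bin/sh
# Sketch only: writes a custom haproxy.cfg into a local directory that can be
# mounted over /etc/haproxy. write_haproxy_cfg is a hypothetical helper name.
write_haproxy_cfg() {
    dir="$1"
    mkdir -p "$dir"
    cat > "$dir/haproxy.cfg" <<'EOF'
global
    # Must be larger than the metrics file served below (assumed: 100k).
    tune.bufsize 102400

defaults
    mode http
    # Timeouts are assumed values, not taken from the original config.
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend openmetrics
    # IPv4-only bind, so the host does not need IPv6 support.
    bind :9099
    default_backend rpki-client

backend rpki-client
    http-request return status 200 content-type text/plain file /var/lib/rpki-client/metrics
EOF
}
```

Calling `write_haproxy_cfg ./haproxy` and then mounting `./haproxy` on `/etc/haproxy` (for example with `docker run -v "$PWD/haproxy:/etc/haproxy" ...`) should reproduce the setup described above.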

I hope this helps someone.

First of all, thank you very much for reporting this! Let's dig into it…

haproxy does not start when no metrics file exists

How do you get into this situation exactly? rpki-client.sh:21 creates this file when the container is started.

haproxy does not start when the metrics file exists and is larger than 16384 bytes (normally the metrics file is about 60000, so definitely too large)

How do you get into this situation exactly? rpki-client.sh:31 changes the configuration file as needed.

haproxy does not start when there is no IPv6 configured on the system

What does this mean? No IPv6 link local address? Or just no IPv6 unicast addresses?

You have a point about the creation of the metrics file in rpki-client.sh. That seems to work.

The haproxy.cfg modification for tune.bufsize is done after the first successful run of rpki-client. haproxy already aborts on start, which is much sooner than when the bufsize gets adjusted. You can see that too if you remove the -q flag from the haproxy command in entrypoint.sh:

WARNING: rpki-client may need more than the available disk space
on the file-system holding /var/cache/rpki-client.
available space: 7941632kB, suggested minimum 512000kB
available inodes 262549, suggested minimum 300000

[NOTICE] (18) : haproxy version is 2.6.14-5188364
[ALERT] (18) : config : parsing [/etc/haproxy/haproxy.cfg:37] : error detected in backend 'rpki-client' while parsing 'http-request return' rule : file '/var/lib/rpki-client/metrics' exceeds the buffer size (599218 > 16384).
[ALERT] (18) : config : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
multirun: one or more of the provided commands ended abnormally
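
One way to avoid this ordering problem would be to size the buffer from the existing metrics file before haproxy is started. The following is only a sketch under assumptions: the helper names `required_bufsize` and `set_bufsize`, the 4096-byte headroom, and the presence of a `tune.bufsize` line in haproxy.cfg are all assumed here, and this is not necessarily how the container does it.

```shell
#!/bin/sh
# Hypothetical helper: print a tune.bufsize value large enough for the file.
required_bufsize() {
    file="$1"
    default=16384   # buffer size limit reported in the haproxy alert
    headroom=4096   # assumed margin for the response headers
    if [ -f "$file" ]; then
        size=$(wc -c < "$file")
    else
        size=0      # no metrics file yet: the default is sufficient
    fi
    needed=$((size + headroom))
    if [ "$needed" -gt "$default" ]; then
        echo "$needed"
    else
        echo "$default"
    fi
}

# Hypothetical helper: rewrite an existing tune.bufsize line in the config.
set_bufsize() {
    cfg="$1"
    metrics="$2"
    bufsize=$(required_bufsize "$metrics")
    sed "s/^\([[:space:]]*tune\.bufsize\)[[:space:]].*/\1 $bufsize/" "$cfg" > "$cfg.tmp" &&
        mv "$cfg.tmp" "$cfg"
}
```

Run before the haproxy command, for example as `set_bufsize /etc/haproxy/haproxy.cfg /var/lib/rpki-client/metrics`, this would also cover a metrics file left over on a persistent volume.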

We don't have any IPv6 support configured on our system, no interfaces, no addresses:

/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:50:56:be:07:7a brd ff:ff:ff:ff:ff:ff
inet 10.117.220.139/27 brd 10.117.220.159 scope global ens192
valid_lft forever preferred_lft forever
/ # haproxy -f /etc/haproxy/haproxy.cfg -W -S /run/haproxy.sock
[NOTICE] (65) : haproxy version is 2.6.14-5188364
[ALERT] (65) : Binding [/etc/haproxy/haproxy.cfg:33] for frontend openmetrics: cannot create receiving socket (Address family not supported by protocol) for [:::9099]
[ALERT] (65) : [haproxy.main()] Some protocols failed to start their listeners! Exiting.
/

Thanks for looking into this. I hope the info above helps.

The haproxy.cfg modification for tune.bufsize is done after the first successful run of rpki-client.

Yes, because I didn't think of persistent volumes (which are a very valid use-case).

[ALERT] (65) : Binding [/etc/haproxy/haproxy.cfg:33] for frontend openmetrics: cannot create receiving socket (Address family not supported by protocol) for [:::9099]
[ALERT] (65) : [haproxy.main()] Some protocols failed to start their listeners! Exiting.

I had to pass ipv6.disable=1 while booting to reproduce this…and it led to the conclusion that I need to fix other containers regarding this, too.
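
A startup script could guard against this case by checking for IPv6 support before choosing the bind address. This is a rough sketch, not the committed fix: `pick_bind_address` is a hypothetical helper, and it relies on the assumption that `/proc/net/if_inet6` is absent on Linux when the kernel was booted with ipv6.disable=1.

```shell
#!/bin/sh
# Hypothetical helper: print a haproxy bind argument for port 9099.
# The probe path can be overridden, which is mainly useful for testing.
pick_bind_address() {
    probe="${1:-/proc/net/if_inet6}"
    if [ -e "$probe" ]; then
        # IPv6 is available: listen on both address families.
        echo ":::9099 v4v6"
    else
        # No IPv6 in the kernel: fall back to an IPv4-only listener.
        echo ":9099"
    fi
}
```

The printed value could then be substituted into the `bind` line of the openmetrics frontend before haproxy is started.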

Thanks for looking into this. I hope the info above helps.

Yes, it helped – thank you for providing the details. I've committed and built 49a42e7 with hopefully all fixes for the master and the 8.5 branch. Your feedback would be appreciated, especially if there are any issues left.

I am running it now, and it started without issues. Looks very promising. Thanks!

Thank you for the quick feedback. Feel free to comment here (or open a new issue if something new arises).