bottlerocket-os/bottlerocket

How to disable IPv6 DAD to reduce startup delay of pods on IPv6 cluster

woehrl01 opened this issue · 9 comments

What I'd like:

We want to set the sysctl net.ipv6.conf.all.optimistic_dad=1.

We created a bootstrap container executing the following script:

#!/bin/bash

set -ex

nsenter -t 1 -m sysctl -w net.ipv6.conf.all.optimistic_dad=1

but it fails with:

+ nsenter -t 1 -m sysctl -w net.ipv6.conf.all.optimistic_dad=1
nsenter: cannot open /proc/1/ns/mnt: Permission denied

How can we change that?

Any alternatives you've considered:

None that I'm aware of. Running the same command from an admin container via sheltie changes the value successfully.

Related to: aws/amazon-vpc-cni-k8s#1631

Apologies, the right way to set this is:

[settings.kernel.sysctl]
"net.ipv6.conf.all.optimistic_dad" = "1"
"net.ipv6.conf.default.optimistic_dad" = "1"

Thank you for your update. I would love to know if (as I hope) you see measurably faster startup with optimistic duplicate address detection.

@larvacea Unfortunately, setting optimistic_dad = 1 or accept_dad = 0, regardless of the interface, has no impact on the startup latency. There is still a 2-3 second delay on IPv6 pod startup (compared to IPv4). I can confirm that the value is picked up by the vethd* interfaces created for the sandboxes.

@larvacea @woehrl01 a couple of other ideas:

Optimistic DAD might need to be combined with "use_optimistic", in order to actually make use of the tentative addresses. Also, given the evidence that DAD is being performed despite accept_dad = 0, we could try setting dad_transmits = 0 to override it:

[settings.kernel.sysctl]
# don't enable DAD 
"net.ipv6.conf.all.accept_dad" = "0"
"net.ipv6.conf.default.accept_dad" = "0"

# don't transmit any DAD probes
"net.ipv6.conf.all.dad_transmits" = "0"
"net.ipv6.conf.default.dad_transmits" = "0"

# if we end up using DAD, go ahead and use the tentative addresses
"net.ipv6.conf.all.optimistic_dad" = "1"
"net.ipv6.conf.all.use_optimistic" = "1"

Thank you @bcressey. I just tried your configuration as well as additional permutations, and the startup delay of around 2 seconds still persists:

IPv4 Cluster:

[Screenshot 2024-04-26 at 20:07:54]

IPv6 Cluster (with the settings):

[Screenshot 2024-04-26 at 20:07:13]

bash-5.1# tail -n +1 /proc/sys/net/ipv6/conf/*/*dad*
==> /proc/sys/net/ipv6/conf/all/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/all/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/all/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/all/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/default/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/default/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/default/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/default/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/eni0dfdceb3448/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/eni0dfdceb3448/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/eni0dfdceb3448/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/eni0dfdceb3448/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/eni19314c3cd96/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/eni19314c3cd96/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/eni19314c3cd96/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/eni19314c3cd96/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/eni79b4cbaf095/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/eni79b4cbaf095/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/eni79b4cbaf095/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/eni79b4cbaf095/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/eni8d1aa624f0c/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/eni8d1aa624f0c/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/eni8d1aa624f0c/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/eni8d1aa624f0c/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/eni8f2e97e2322/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/eni8f2e97e2322/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/eni8f2e97e2322/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/eni8f2e97e2322/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/enid559aefed0e/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/enid559aefed0e/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/enid559aefed0e/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/enid559aefed0e/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/enie114b69e62e/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/enie114b69e62e/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/enie114b69e62e/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/enie114b69e62e/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/eth0/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/eth0/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/eth0/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/eth0/optimistic_dad <==
0

==> /proc/sys/net/ipv6/conf/lo/accept_dad <==
-1

==> /proc/sys/net/ipv6/conf/lo/dad_transmits <==
1

==> /proc/sys/net/ipv6/conf/lo/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/lo/optimistic_dad <==
0

==> /proc/sys/net/ipv6/conf/veth1003d20a/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/veth1003d20a/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/veth1003d20a/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/veth1003d20a/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/veth1de24e44/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/veth1de24e44/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/veth1de24e44/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/veth1de24e44/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/veth2df40afd/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/veth2df40afd/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/veth2df40afd/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/veth2df40afd/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/veth4024dff7/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/veth4024dff7/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/veth4024dff7/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/veth4024dff7/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/veth524efe7e/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/veth524efe7e/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/veth524efe7e/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/veth524efe7e/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/vetha8d8bf98/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/vetha8d8bf98/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/vetha8d8bf98/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/vetha8d8bf98/optimistic_dad <==
1

==> /proc/sys/net/ipv6/conf/vethce2339e3/accept_dad <==
0

==> /proc/sys/net/ipv6/conf/vethce2339e3/dad_transmits <==
0

==> /proc/sys/net/ipv6/conf/vethce2339e3/enhanced_dad <==
1

==> /proc/sys/net/ipv6/conf/vethce2339e3/optimistic_dad <==
1
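A dump this long is easy to misread, so a small helper that prints only the deviating entries may help. This is a minimal sketch, not something used in the thread: the function name `check_dad_sysctls` and the expected values (accept_dad=0, dad_transmits=0, optimistic_dad=1, taken from the settings above) are assumptions, and the directory is a parameter only so the function can be pointed at a test tree.

```shell
#!/bin/bash
# Hypothetical helper: list every interface whose DAD-related sysctl
# differs from the values configured in this thread.
check_dad_sysctls() {
  # Defaults to the live sysctl tree of the current network namespace.
  local conf_dir="${1:-/proc/sys/net/ipv6/conf}"
  local iface_dir name pair key want got
  for iface_dir in "$conf_dir"/*/; do
    name="$(basename "$iface_dir")"
    for pair in accept_dad=0 dad_transmits=0 optimistic_dad=1; do
      key="${pair%%=*}"    # sysctl file name
      want="${pair#*=}"    # expected value
      [ -r "$iface_dir$key" ] || continue
      got="$(cat "$iface_dir$key")"
      if [ "$got" != "$want" ]; then
        echo "$name $key=$got (expected $want)"
      fi
    done
  done
}
```

Calling `check_dad_sysctls` with no argument reads the live tree. Against the dump above it would report only eth0 (optimistic_dad) and lo (all three values). Like the dump, it only sees the network namespace it runs in.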

Digging through some code around the web, I came across the following implementation in Android: https://android.googlesource.com/platform/frameworks/base/+/befe778%5E%21/#F0

It looks like, when optimistic_dad is enabled, IFA_F_TENTATIVE is set together with IFA_F_OPTIMISTIC, so the following check in the AWS VPC CNI still fails until DAD has succeeded: https://github.com/aws/amazon-vpc-cni-k8s/pull/1631/files#diff-afc7977e1f00abb3f66455a7d491ded671d38ffa43e0dc910606084ec4fd4841R250-R255
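To see which flags an address actually carries, the flags column of /proc/net/if_inet6 can be decoded directly. The sketch below is illustrative only: the function name `decode_if_inet6` is made up, the bit values come from linux/if_addr.h (IFA_F_OPTIMISTIC 0x04, IFA_F_DADFAILED 0x08, IFA_F_TENTATIVE 0x40, IFA_F_PERMANENT 0x80), and only those four flags are decoded.

```shell
#!/bin/bash
# Hypothetical helper: decode the flags column of /proc/net/if_inet6
# lines read from stdin.
# Columns: address, ifindex, prefix length, scope, flags (hex), interface.
decode_if_inet6() {
  local addr idx plen scope flags name value names
  while read -r addr idx plen scope flags name; do
    [ -n "$name" ] || continue         # skip malformed or empty lines
    value=$((16#$flags))               # hex string to integer
    names=""
    (( value & 0x04 )) && names="$names optimistic"
    (( value & 0x08 )) && names="$names dadfailed"
    (( value & 0x40 )) && names="$names tentative"
    (( value & 0x80 )) && names="$names permanent"
    echo "$name $addr${names:- -}"     # "-" when none of the four is set
  done
}
```

On a node this would be run as `decode_if_inet6 < /proc/net/if_inet6`. The file only lists addresses in the current network namespace, so pod-side addresses would have to be read from inside the pod's namespace.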

I'm still not sure why IFA_F_TENTATIVE is set when DAD is disabled. However, I found the following (fixed) Red Hat bug, in which the address was set to tentative even with dad_transmits=0: https://bugzilla.redhat.com/show_bug.cgi?id=709271

edit:

I have some additional findings. Running the following script on a node with the above settings clearly shows that no interfaces are created in the tentative state. With default settings, the interfaces do show up in that state.

for i in {1..1000}; do ip -6 addr show | grep "tentative"; sleep 0.1; done
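A variant of the loop above that prints a timestamp plus only the interface and address makes it easier to see how long an address stays tentative. This is a sketch under assumptions: the function names `tentative_addrs` and `watch_tentative` are made up, the parser relies on the one-line format of `ip -6 -o addr show`, and `%N` requires GNU date.

```shell
#!/bin/bash
# Hypothetical helper: filter `ip -6 -o addr show` output on stdin and
# print "<interface> <address/prefix>" for every tentative address.
tentative_addrs() {
  awk '{ for (i = 5; i <= NF; i++) if ($i == "tentative") { print $2, $4; break } }'
}

# Poll the current network namespace, one timestamped line per hit.
watch_tentative() {
  local rounds="${1:-1000}" i line
  for ((i = 0; i < rounds; i++)); do
    ip -6 -o addr show | tentative_addrs | while read -r line; do
      echo "$(date +%T.%N) $line"      # %N (nanoseconds) is GNU date only
    done
    sleep 0.1
  done
}
```

`watch_tentative 100` would poll for roughly ten seconds. As with the original loop, it only sees addresses in the network namespace it runs in.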

@woehrl01 based on your last update, this seems to be expected behavior, right?

> I have some additional findings. Running the following script on a node with the above settings clearly shows that no interfaces are created in the tentative state. With default settings, the interfaces do show up in that state.

Do you mind clarifying the open request if one still exists?

@KCSesh there is a behaviour I don't understand: the CNI plugin clearly waits for 2 seconds on an address in the tentative state even though DAD is disabled.

So the question is: are there additional settings that need to be applied to fully disable DAD, so that all interfaces are usable immediately?