vitobotta/hetzner-k3s

testing v2.0.0: autoscale node-ip

axgkl opened this issue · 5 comments

Hi,
I'm testing autoscaling on private-IP clusters.

After this #394 (comment), the autoscaler creates the nodes and they join the cluster. All kube-system pods on the new nodes are running.

The only remaining problem: when I schedule 6 pods with an affinity rule that they should land on 6 different hosts, only the 3 on my 3 masters get to Running - the others remain Pending.

The CCM log says:

E0731 20:39:48.001894       1 node_controller.go:240] error syncing 'medium-autoscaled-2f625ed856f50e8': failed to get node modifiers from cloud provider: provided node ip for node "medium-autoscaled-2f625ed856f50e8" is not valid: failed to get node address from cloud provider that matches ip: 100.66.1.123, requeuing

And those nodes kept the unschedulable taint.

Then I went onto an autoscaled node via ssh and changed the --node-ip in /etc/systemd/system/k3s-agent.service from 100.66.1.123 to the IP within the private network (the 10.1.0.10 below) - and restarted the service.

=> Instantly working pod, i.e. that was the culprit.
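The manual workaround can be sketched roughly like this - a hypothetical reconstruction (the exact commands were not recorded here), run against a scratch copy of the unit file so it works anywhere; on the node the path is /etc/systemd/system/k3s-agent.service:

```shell
# Hypothetical reconstruction of the manual fix; uses a scratch file
# instead of /etc/systemd/system/k3s-agent.service so it can run anywhere.
unit=$(mktemp)
printf "%s\n" "    '--node-ip=100.66.1.123' \\" > "$unit"
# Point --node-ip at the private-network address instead of the eth0 one:
sed -i 's/--node-ip=100\.66\.1\.123/--node-ip=10.1.0.10/' "$unit"
cat "$unit"
rm -f "$unit"
# On the real node, follow up with:
#   systemctl daemon-reload && systemctl restart k3s-agent
```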

 
❯ kc get nodes -o wide
NAME                                 STATUS   ROLES                       AGE     VERSION        INTERNAL-IP      EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION     CONTAINER-RUNTIME
ax-master1                           Ready    control-plane,etcd,master   24m     v1.30.2+k3s2   10.1.0.5         <none>        Ubuntu 24.04 LTS   6.8.0-38-generic   containerd://1.7.17-k3s1
ax-master2                           Ready    control-plane,etcd,master   24m     v1.30.2+k3s2   10.1.0.3         <none>        Ubuntu 24.04 LTS   6.8.0-38-generic   containerd://1.7.17-k3s1
ax-master3                           Ready    control-plane,etcd,master   25m     v1.30.2+k3s2   10.1.0.4         <none>        Ubuntu 24.04 LTS   6.8.0-38-generic   containerd://1.7.17-k3s1
medium-autoscaled-16f63cb4d524e27b   Ready    <none>                      5m43s   v1.30.2+k3s2   10.1.0.10        <none>        Ubuntu 24.04 LTS   6.8.0-38-generic   containerd://1.7.17-k3s1
medium-autoscaled-20a343c4e5b82278   Ready    <none>                      9m5s    v1.30.2+k3s2   100.66.12.139    <none>        Ubuntu 24.04 LTS   6.8.0-38-generic   containerd://1.7.17-k3s1
medium-autoscaled-3cf3b4eb87977ee3   Ready    <none>                      5m22s   v1.30.2+k3s2   100.66.240.199   <none>        Ubuntu 24.04 LTS   6.8.0-38-generic   containerd://1.7.17-k3s1

and

❯ kc get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE                                 NOMINATED NODE   READINESS GATES
xhello-world-6bf6c965df-547ft   1/1     Running   0          31m   10.50.2.160   ax-master1                           <none>           <none>
xhello-world-6bf6c965df-58n54   1/1     Running   0          31m   10.50.1.222   ax-master2                           <none>           <none>
xhello-world-6bf6c965df-9kqlb   1/1     Running   0          31m   10.50.0.133   ax-master3                           <none>           <none>
xhello-world-6bf6c965df-h2k4h   1/1     Running   0          31m   10.50.7.112   medium-autoscaled-16f63cb4d524e27b   <none>           <none>
xhello-world-6bf6c965df-pqdpj   0/1     Pending   0          31m   <none>        <none>                               <none>           <none>
xhello-world-6bf6c965df-x2h4n   0/1     Pending   0          31m   <none>        <none>                               <none>           <none>

I will dig deeper into why the eth0 IP is taken as --node-ip and not the private-network one.

Details: on a non-working autoscaled node:

root@medium-autoscaled-44fb554df4d13d7a:~# cat /etc/systemd/system/k3s-agent.service
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target

[Install]
WantedBy=multi-user.target

[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s-agent.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service 2>/dev/null'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
    agent \
	'--node-name=medium-autoscaled-44fb554df4d13d7a' \
	'--kubelet-arg' \
	'cloud-provider=external' \
	'--kubelet-arg' \
	'resolv-conf=/etc/k8s-resolv.conf' \
	'--node-ip=100.66.224.163' \
	'--node-external-ip=100.66.224.163' \

cat /var/lib/cloud/instances/51141072/user-data.txt

(...)
- echo "Done" > /.status
- |
    touch /etc/initialized

    if [[ $(</etc/initialized) != "true" ]]; then
    	systemctl restart NetworkManager || true
    	dhclient eth1 -v || true
    fi

    HOSTNAME=$(hostname -f)
    PUBLIC_IP=$(hostname -I | awk '{print $1}')

    if [[ "true" = "true" ]]; then
    	PRIVATE_IP=$(ip route get 10.1.0.0 | awk -F"src " 'NR==1{split($2,a," ");print a[1]}')
    	NETWORK_INTERFACE=" --flannel-iface=$(ip route get 10.1.0.0 | awk -F"dev " 'NR==1{split($2,a," ");print a[1]}') "
    else
    	PRIVATE_IP="${PUBLIC_IP}"
    	NETWORK_INTERFACE=" "
    fi

    mkdir -p /etc/rancher/k3s

    cat >/etc/rancher/k3s/registries.yaml <<EOF
    mirrors:
      "*":
    EOF

    curl -sfL https://get.k3s.io | K3S_TOKEN="8d43d9f9268fb60cecefd04684485df0" INSTALL_K3S_VERSION="v1.30.2+k3s2" K3S_URL=https://10.1.0.4:6443 INSTALL_K3S_EXEC="agent \
    --node-name=$HOSTNAME  --kubelet-arg "cloud-provider=external"  --kubelet-arg "resolv-conf=/etc/k8s-resolv.conf"  \
    --node-ip=$PRIVATE_IP \
    --node-external-ip=$PUBLIC_IP \
    $NETWORK_INTERFACE " sh -

    echo true >/etc/initialized
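As a side note, the PRIVATE_IP / NETWORK_INTERFACE extraction above can be checked in isolation by feeding the awk pipelines a canned "ip route get 10.1.0.0" line (hypothetical sample output; the real command needs the private network attached):

```shell
# Demo of the awk parsing from the user-data script, fed a canned
# "ip route get 10.1.0.0" line so it runs without the actual network:
route_line="10.1.0.0 dev enp7s0 src 10.1.0.7 uid 0"
PRIVATE_IP=$(printf '%s\n' "$route_line" | awk -F"src " 'NR==1{split($2,a," ");print a[1]}')
IFACE=$(printf '%s\n' "$route_line" | awk -F"dev " 'NR==1{split($2,a," ");print a[1]}')
echo "$PRIVATE_IP $IFACE"
# prints: 10.1.0.7 enp7s0
```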


When I run these commands manually in the (bash) shell, starting from the PUBLIC_IP=... line:

# PUBLIC_IP=$(hostname -I | awk '{print $1}')

    if [[ "true" = "true" ]]; then
        PRIVATE_IP=$(ip route get 10.1.0.0 | awk -F"src " 'NR==1{split($2,a," ");print a[1]}')
        NETWORK_INTERFACE=" --flannel-iface=$(ip route get 10.1.0.0 | awk -F"dev " 'NR==1{split($2,a," ");print a[1]}') "
    else
        PRIVATE_IP="${PUBLIC_IP}"
        NETWORK_INTERFACE=" "
    fi
root@medium-autoscaled-44fb554df4d13d7a:~# echo $PRIVATE_IP
10.1.0.7
root@medium-autoscaled-44fb554df4d13d7a:~# curl -sfL https://get.k3s.io | K3S_TOKEN="8d43d9f9268fb60cecefd04684485df0" INSTALL_K3S_VERSION="v1.30.2+k3s2" K3S_URL=https://10.1.0.4:6443 INSTALL_K3S_EXEC="agent \
    --node-name=$HOSTNAME  --kubelet-arg "cloud-provider=external"  --kubelet-arg "resolv-conf=/etc/k8s-resolv.conf"  \
    --node-ip=$PRIVATE_IP \
    --node-external-ip=$PUBLIC_IP \
    $NETWORK_INTERFACE " sh -
[INFO]  Using v1.30.2+k3s2 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.30.2+k3s2/sha256sum-amd64.txt
[INFO]  Skipping binary downloaded, installed k3s matches hash
[INFO]  Skipping installation of SELinux RPM
[INFO]  Skipping /usr/local/bin/kubectl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/crictl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, already exists
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s-agent.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s-agent.service
[INFO]  systemd: Enabling k3s-agent unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.
[INFO]  systemd: Starting k3s-agent
root@medium-autoscaled-44fb554df4d13d7a:~# tail /etc/systemd/system/k3s-agent.service
    agent \
	'--node-name=medium-autoscaled-44fb554df4d13d7a' \
	'--kubelet-arg' \
	'cloud-provider=external' \
	'--kubelet-arg' \
	'resolv-conf=/etc/k8s-resolv.conf' \
	'--node-ip=10.1.0.7' \
	'--node-external-ip=100.66.224.163' \
	'--flannel-iface=enp7s0' \

root@medium-autoscaled-44fb554df4d13d7a:~#

and, btw:

root@medium-autoscaled-44fb554df4d13d7a:~# cloud-init status
status: done

=> All fine.

So: why did cloud-init produce a different result, as if the if [[ "true" = "true" ]]; condition had failed? (Also no flannel setting - it really looks like that condition failed.)

Aaaah, there we go: "[[" is a bashism, not POSIX standard:

root@medium-autoscaled-44fb554df4d13d7a:~# /bin/sh
# if [[ "true" = "true" ]]; then echo hi; else echo ho; fi
/bin/sh: 1: [[: not found
ho

Cloud-init scripts are executed in a sh shell. This is a POSIX-compliant shell, similar to bash, but with fewer features.

Will compile and test with a /bin/sh-compliant version.
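The fix itself is tiny - a POSIX single-bracket test succeeds where the bash-only "[[" made plain /bin/sh fall into the else branch (minimal sketch; in the user-data template only the condition changes, the branch bodies stay as they are):

```shell
# Minimal check of the fix: the POSIX single-bracket test works in plain
# /bin/sh (dash), whereas [[ ]] fails there with "[[: not found".
sh -c 'if [ "true" = "true" ]; then echo hi; else echo ho; fi'
# prints: hi
```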

Cool, that fixed it. Finally, autoscaling works end to end on private-IP clusters:

❯ kc get pods -o wide
NAME                            READY   STATUS    RESTARTS   AGE     IP            NODE                                 NOMINATED NODE   READINESS GATES
xhello-world-6bf6c965df-56whk   1/1     Running   0          2m38s   10.50.3.179   medium-autoscaled-68f6454a9986e5bd   <none>           <none>
xhello-world-6bf6c965df-dm2p5   1/1     Running   0          2m38s   10.50.2.241   ax-master2                           <none>           <none>
xhello-world-6bf6c965df-dsbkb   1/1     Running   0          2m38s   10.50.4.236   medium-autoscaled-649a0eab32ddd7f5   <none>           <none>
xhello-world-6bf6c965df-v92kb   1/1     Running   0          2m38s   10.50.1.17    ax-master3                           <none>           <none>
xhello-world-6bf6c965df-wscf4   1/1     Running   0          2m38s   10.50.5.31    medium-autoscaled-3079dca71025eda    <none>           <none>
xhello-world-6bf6c965df-zf599   1/1     Running   0          2m38s   10.50.0.173   ax-master1                           <none>           <none>

Note: it did work on non-private-IP clusters, because the failing "[[" command makes the if fall through to the else branch - i.e. exactly the branch intended for public IPs.

 if [[ "true" = "true" ]]; then
    	PRIVATE_IP=$(ip route get 10.1.0.0 | awk -F"src " 'NR==1{split($2,a," ");print a[1]}')
    	NETWORK_INTERFACE=" --flannel-iface=$(ip route get 10.1.0.0 | awk -F"dev " 'NR==1{split($2,a," ");print a[1]}') "
    else
    	PRIVATE_IP="${PUBLIC_IP}"
    	NETWORK_INTERFACE=" "
    fi
    
With "[[" behaving like this in /bin/sh:

# if [[ "true" = "true" ]]; then echo hi; else echo ho; fi
/bin/sh: 1: [[: not found
ho

Do you still have problems with this?

Closing since I just released v2.0.0 and this stuff seems to work fine from my testing today. Please open another issue if still needed with the new version. Thanks for the PRs!