[Bug] preBootstrapCommands is not working in AL2023
xiangyanw opened this issue · 11 comments
What were you trying to accomplish?
I want to mount a data volume on an EKS node running AL2023 using preBootstrapCommands.
What happened?
I configured preBootstrapCommands for a managed nodegroup in EKS version 1.30, but those commands were not added to the userdata.
Here is my preBootstrapCommands:
preBootstrapCommands:
- "sudo mkfs.xfs /dev/nvme1n1; sudo mkdir -p /var/lib/containerd ;sudo echo /dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2 >> /etc/fstab"
- "sudo mount -a"
Here is the resulting userdata in the launch template:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=78e7aff85774192583069ede05ed2bd166f9168b5ca780bcb90184ac8c40

--78e7aff85774192583069ede05ed2bd166f9168b5ca780bcb90184ac8c40
Content-Type: text/x-shellscript
Content-Type: charset="us-ascii"

#!/bin/bash
set -o errexit
set -o pipefail
set -o nounset
touch /run/xtables.lock
--78e7aff85774192583069ede05ed2bd166f9168b5ca780bcb90184ac8c40--
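To make the symptom easy to confirm, here is a minimal sketch of a check that reports which of the configured commands are missing from a decoded user data file. The helper name `userdata_has` and the temporary file are assumptions made for illustration. The file contents are the script part of the user data shown above.

```shell
#!/bin/sh
# userdata_has FILE COMMAND (hypothetical helper):
# succeeds if FILE contains COMMAND as a literal string.
userdata_has() {
  grep -qF -- "$2" "$1"
}

# Recreate the script part of the reported user data in a temporary file.
userdata_file="$(mktemp)"
cat > "$userdata_file" <<'EOF'
#!/bin/bash
set -o errexit
set -o pipefail
set -o nounset
touch /run/xtables.lock
EOF

# Check for fragments of the configured preBootstrapCommands.
for cmd in "mkfs.xfs /dev/nvme1n1" "sudo mount -a"; do
  if userdata_has "$userdata_file" "$cmd"; then
    echo "present: $cmd"
  else
    echo "MISSING: $cmd"
  fi
done
```

With the user data from this report, both commands are reported as missing. To run the same check against a real launch template, the user data can first be fetched and decoded, for example with `aws ec2 describe-launch-template-versions --launch-template-id <id> --versions '$Latest' --query 'LaunchTemplateVersions[0].LaunchTemplateData.UserData' --output text | base64 --decode`, where `<id>` is a placeholder for your launch template ID.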
How to reproduce it?
Use the following YAML to create a nodegroup for EKS 1.30. Execute command: eksctl create ng -f xxx.yaml
- name: nodegroup
  instanceType: c6a.large
  minSize: 0
  desiredCapacity: 1
  maxSize: 2
  volumeSize: 30
  volumeType: 'gp3'
  privateNetworking: true
  preBootstrapCommands:
    - "sudo mkfs.xfs /dev/nvme1n1; sudo mkdir -p /var/lib/containerd ;sudo echo /dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2 >> /etc/fstab"
    - "sudo mount -a"
  additionalVolumes:
    - volumeName: '/dev/xvdb' # required
      volumeSize: 50
      volumeType: 'gp3'
Logs
2024-07-29 03:13:13 [ℹ] nodegroup "xxxx-nodegroup" will use "" [AmazonLinux2023/1.30]
2024-07-29 03:13:13 [ℹ] nodegroup "nodegroup" will use "" [AmazonLinux2023/1.30]
2024-07-29 03:13:17 [ℹ] 1 existing nodegroup(s) (xxxx-nodegroup) will be excluded
2024-07-29 03:13:17 [ℹ] 1 nodegroup (nodegroup) was included (based on the include/exclude rules)
2024-07-29 03:13:17 [ℹ] will create a CloudFormation stack for each of 1 managed nodegroups in cluster "xxxx"
2024-07-29 03:13:17 [ℹ]
2 sequential tasks: { fix cluster compatibility, 1 task: { 1 task: { create managed nodegroup "nodegroup" } }
}
2024-07-29 03:13:17 [ℹ] checking cluster stack for missing resources
2024-07-29 03:13:19 [ℹ] cluster stack has all required resources
2024-07-29 03:13:21 [ℹ] building managed nodegroup stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:13:22 [ℹ] deploying stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:13:22 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:13:53 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:14:44 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:16:22 [ℹ] waiting for CloudFormation stack "eksctl-xxxx-nodegroup-nodegroup"
2024-07-29 03:16:22 [ℹ] no tasks
2024-07-29 03:16:22 [✔] created 0 nodegroup(s) in cluster "xxxx"
2024-07-29 03:16:22 [✔] created 1 managed nodegroup(s) in cluster "xxxx"
2024-07-29 03:16:24 [ℹ] checking security group configuration for all nodegroups
2024-07-29 03:16:24 [ℹ] all nodegroups have up-to-date cloudformation templates
Anything else we need to know?
This works as expected when I use an AL2 AMI in the same cluster:
- name: nodegroup2
  amiFamily: AmazonLinux2
  instanceType: c6a.large
  minSize: 0
  desiredCapacity: 1
  maxSize: 2
  volumeSize: 30
  volumeType: 'gp3'
  privateNetworking: true
  preBootstrapCommands:
    - "sudo mkfs.xfs /dev/nvme1n1; sudo mkdir -p /var/lib/containerd ;sudo echo /dev/nvme1n1 /var/lib/containerd xfs defaults,noatime 1 2 >> /etc/fstab"
    - "sudo mount -a"
  additionalVolumes:
    - volumeName: '/dev/xvdb' # required
      volumeSize: 50
      volumeType: 'gp3'
Versions
eksctl version: 0.187.0
kubectl version: v1.24.0
OS: linux
preBootstrapCommands is not supported for AL2023 nodegroups. This validation exists for self-managed nodegroups but is missing for managed nodegroups, so create nodegroup silently ignores that field rather than failing early with an error. We'll work on a fix soon.
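The kind of early failure described here could look roughly like the following pre-flight check. This is a hypothetical shell sketch, not eksctl's actual validation (eksctl itself is written in Go). The function name `check_prebootstrap` and its arguments are assumptions. An empty AMI family is treated as AmazonLinux2023 because the logs above show a nodegroup with no amiFamily resolving to AmazonLinux2023.

```shell
#!/bin/sh
# check_prebootstrap AMI_FAMILY COMMAND_COUNT (hypothetical helper)
# Fails when preBootstrapCommands is set on an AL2023 nodegroup,
# instead of letting the field be dropped silently.
# An empty or missing AMI_FAMILY is assumed to mean AmazonLinux2023.
check_prebootstrap() {
  if [ "${1:-AmazonLinux2023}" = "AmazonLinux2023" ] && [ "${2:-0}" -gt 0 ]; then
    echo "error: preBootstrapCommands is not supported for AmazonLinux2023 nodegroups" >&2
    return 1
  fi
  return 0
}

# Usage: two commands configured, as in the nodegroups from this report.
check_prebootstrap "AmazonLinux2023" 2 || echo "rejected AL2023 nodegroup"
check_prebootstrap "AmazonLinux2" 2 && echo "accepted AL2 nodegroup"
```

A check like this could run in CI before eksctl create ng, so that an unsupported field produces an error rather than a node that boots without the expected setup.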
What is the alternative if preBootstrapCommands is not supported for AL2023?
I agree, what should we use instead? The question perhaps should be: Are there any plans to create something more or less equivalent to preBootstrapCommands available in AL2023? This is the one thing that stops us from using AL2023.
We NEED preBootstrapCommands to work because we rely on it to provide custom CA certificates for pulling container images from a private container registry.
AL2023 is now the default, so please understand this is going to affect a lot of customers without them even realizing it.
@TiberiuGC any update on when something will be supported for AL2023?
My take on this is that the most urgent matter is adding a validation for managed nodegroups, so that we don't end up impacting customers in the way described above. We'll likely have a fix for this next week.
As for preBootstrapCommands / overrideBootstrapCommand alternatives for AL2023, I don't have a date to share yet. I'll bump this internally so we can correctly assess where it stands in our backlog of priorities. But I can appreciate there's considerable community interest, and I'll make sure to articulate that.
@TiberiuGC Just ran into this issue myself and burned a few hours troubleshooting. I use preBootstrapCommands to inject HTTP proxy env vars, and this is a must-have for working in a locked-down corporate environment.
A warning message with instructions to fall back to Amazon Linux 2 would be helpful, but this is really a showstopper for enterprise customers. I simply can't use AL2023 without injecting HTTP proxy settings.
Also tell management this disproportionally impacts enterprise customers who have fat budgets and are looking to spin up massive instances to run their internal apps that maybe a handful of people actually use and then turn around and forget they're running...forever. So much compute billing...
Happy to help however I can. Where would one start if they're interested in injecting preBootstrapCommands in AL2023?
@jonathanfoster, we are working on adding support for preBootstrapCommands in AL2023. Please stay tuned.
Any ETA?
+1 🆙