Custom TinkerbellTemplateConfig fails to work
Closed this issue · 5 comments
What happened: When using a custom TinkerbellTemplateConfig, the resulting Tinkerbell workflow is empty and the cluster fails to provision.
What you expected to happen: For the cluster to provision.
How to reproduce it (as minimally and precisely as possible):
Here's my cluster.yaml:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-eksa-cluster
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:
    count: 1
    endpoint:
      host: "147.75.202.254"
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: my-eksa-cluster-cp
  datacenterRef:
    kind: TinkerbellDatacenterConfig
    name: my-eksa-cluster
  kubernetesVersion: "1.23"
  managementCluster:
    name: my-eksa-cluster
  workerNodeGroupConfigurations:
  - count: 1
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: my-eksa-cluster
    name: md-0
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellDatacenterConfig
metadata:
  name: my-eksa-cluster
spec:
  tinkerbellIP: "147.75.202.253"
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
  name: my-eksa-cluster-cp
spec:
  hardwareSelector:
    type: cp
  osFamily: bottlerocket
  templateRef:
    kind: TinkerbellTemplateConfig
    name: my-eksa-cluster
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQC1Zweba/X6qrXQ6ubIkZHq1yFF9VRlMUiK457vtuI0Psdg73OLJmh67XmhZ6QkRQjToLYZ5PzppL4QVVPceyA5OHkh8E8HHg3JsZTynXo7YoneI7PQP6DIPjd3z4T28zox6gNNsVpkoeMPmCxJJg5y+9vz8PbEHsFUX9MmWYLWCgltXT+Cr/hudNNxZB4nD2EhNffrRsLlmxf/Cl8fH4xHBSB3W+AKit9cdIXM2SRxUQ2drq2HTiPuFv75+8t4ZvX4j/szV0Z9TguLR2vILzhv/K7FD1LMeGOS/fi5YdIoKy2/46j3ooeuP9OUUkFK5y1Q1dhbdtZWIn6ImmPkjEAAsWl4c4ApvycgkDlMqdKegspmYjtaf1yACacS4tAuZyhuNObMiX0SfwEisiNm8QgOfZsVBwrvAAL2qRosmDMKk5rQpKMsn7yXhbSwtEmFdnODymAxrKezy54C9H0xwDE0YER3FFf56/RzEaQ5Lfyh03kZcOdSe5nIGz4FlSWJ79S9VuS5nxx3kTgHOPa1G1D3MTps4bVUCcR4rJOqHQTPDIG+Xk5Zr377oG0VMQE664KbrcdJ7jujUpxV/Krnm7z/lzl9EkecNHYg8W83XVNqoIA5oZ5R0OmqceQkjOcunCOqQOOSxVoHGS0nreMine0HoVYCJg6vcws4Qc1qCiPiJQ==
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
  name: my-eksa-cluster
spec:
  hardwareSelector:
    type: dp
  osFamily: bottlerocket
  templateRef:
    kind: TinkerbellTemplateConfig
    name: my-eksa-cluster
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQC1Zweba/X6qrXQ6ubIkZHq1yFF9VRlMUiK457vtuI0Psdg73OLJmh67XmhZ6QkRQjToLYZ5PzppL4QVVPceyA5OHkh8E8HHg3JsZTynXo7YoneI7PQP6DIPjd3z4T28zox6gNNsVpkoeMPmCxJJg5y+9vz8PbEHsFUX9MmWYLWCgltXT+Cr/hudNNxZB4nD2EhNffrRsLlmxf/Cl8fH4xHBSB3W+AKit9cdIXM2SRxUQ2drq2HTiPuFv75+8t4ZvX4j/szV0Z9TguLR2vILzhv/K7FD1LMeGOS/fi5YdIoKy2/46j3ooeuP9OUUkFK5y1Q1dhbdtZWIn6ImmPkjEAAsWl4c4ApvycgkDlMqdKegspmYjtaf1yACacS4tAuZyhuNObMiX0SfwEisiNm8QgOfZsVBwrvAAL2qRosmDMKk5rQpKMsn7yXhbSwtEmFdnODymAxrKezy54C9H0xwDE0YER3FFf56/RzEaQ5Lfyh03kZcOdSe5nIGz4FlSWJ79S9VuS5nxx3kTgHOPa1G1D3MTps4bVUCcR4rJOqHQTPDIG+Xk5Zr377oG0VMQE664KbrcdJ7jujUpxV/Krnm7z/lzl9EkecNHYg8W83XVNqoIA5oZ5R0OmqceQkjOcunCOqQOOSxVoHGS0nreMine0HoVYCJg6vcws4Qc1qCiPiJQ==
---
{}
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellTemplateConfig
metadata:
  name: my-eksa-cluster
spec:
  template:
    global_timeout: 6000
    id: ""
    name: my-eksa-cluster
    tasks:
    - actions:
      - environment:
          COMPRESSED: "true"
          DEST_DISK: /dev/sda
          IMG_URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/14/artifacts/raw/1-23/bottlerocket-v1.23.7-eks-d-1-23-4-eks-a-14-amd64.img.gz
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
        name: stream-image
        timeout: 600
      - environment:
          CONTENTS: |
            # Version is required, it will change as we support
            # additional settings
            version = 1
            # "eno1" is the interface name
            # Users may turn on dhcp4 and dhcp6 via boolean
            [enp1s0f0np0]
            dhcp4 = true
            dhcp6 = false
            # Define this interface as the "primary" interface
            # for the system. This IP is what kubelet will use
            # as the node IP. If none of the interfaces has
            # "primary" set, we choose the first interface in
            # the file
            primary = true
          DEST_DISK: /dev/sda12
          DEST_PATH: /net.toml
          DIRMODE: "0755"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
        name: write-netplan
        pid: host
        timeout: 90
      - environment:
          BOOTCONFIG_CONTENTS: |
            kernel {
                console = "ttyS1,115200n8"
            }
            init {
                systemd.log_level=debug
            }
          DEST_DISK: /dev/sda12
          DEST_PATH: /bootconfig.data
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
        name: write-bootconfig
        pid: host
        timeout: 90
      - environment:
          DEST_DISK: /dev/sda12
          DEST_PATH: /user-data.toml
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          HEGEL_URLS: http://147.75.202.242:50061,http://147.75.202.253:50061
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
        name: write-user-data
        pid: host
        timeout: 90
      - image: public.ecr.aws/eks-anywhere/tinkerbell/hub/reboot:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
        name: reboot-image
        pid: host
        timeout: 90
      volumes:
      - /worker:/worker
    version: "0.1"
After running eksctl anywhere create cluster --filename my-eksa-cluster.yaml --hardware-csv hardware.csv --tinkerbell-bootstrap-ip 147.75.202.242 with this file, the output looks good:
Warning: The recommended number of control plane nodes is 3 or 5
Warning: The recommended number of control plane nodes is 3 or 5
Performing setup and validations
✅ Tinkerbell Provider setup is valid
✅ Validate certificate for registry mirror
✅ Validate authentication for git provider
✅ Create preflight validations pass
Creating new bootstrap cluster
Provider specific pre-capi-install-setup on bootstrap cluster
Installing cluster-api providers on bootstrap cluster
Provider specific post-setup
Creating new workload cluster
However, it never gets past the "Creating new workload cluster" step. The workload machines boot into LinuxKit, but just stay there.
The tink-server and tink-controller logs are more illuminating:
root@eksa-admin:~# kubectl -n eksa-system logs tink-server-69bb8bc84-fnhfs
{"level":"info","ts":1661524365.8658762,"caller":"tink-server/main.go:249","msg":"no config file found","service":"github.com/tinkerbell/tink"}
{"level":"info","ts":1661524365.8661466,"caller":"metrics/metrics.go:58","msg":"initializing label values","service":"github.com/tinkerbell/tink"}
{"level":"info","ts":1661524365.8665445,"caller":"tink-server/main.go:130","msg":"starting version 8011b72","service":"github.com/tinkerbell/tink"}
{"level":"info","ts":1661524365.9585457,"logger":"fallback","caller":"manager/internal.go:362","msg":"Starting server","path":"/metrics","kind":"metrics","addr":"[::]:41147"}
{"level":"info","ts":1661524365.9585578,"logger":"fallback","caller":"manager/internal.go:362","msg":"Starting server","kind":"health probe","addr":"[::]:43409"}
{"level":"info","ts":1661524365.9593925,"caller":"tink-server/main.go:207","msg":"started listener","service":"github.com/tinkerbell/tink","address":"[::]:42113"}
{"level":"info","ts":1661524365.9595804,"caller":"http-server/http_server.go:31","msg":"serving http","service":"github.com/tinkerbell/tink"}
root@eksa-admin:~# kubectl -n eksa-system logs tink-controller-manager-7cbf8c4d66-pf68b
{"level":"info","ts":1661524367.067631,"caller":"tink-controller/main.go:106","msg":"no config file found","service":"github.com/tinkerbell/tink"}
{"level":"info","ts":1661524367.0677176,"caller":"tink-controller/main.go:60","msg":"starting controller version 8011b72","service":"github.com/tinkerbell/tink"}
{"level":"info","ts":1661524367.1584876,"logger":"fallback","caller":"manager/internal.go:362","msg":"Starting server","kind":"health probe","addr":"[::]:46545"}
{"level":"info","ts":1661524367.158503,"logger":"fallback","caller":"manager/internal.go:362","msg":"Starting server","path":"/metrics","kind":"metrics","addr":"[::]:43017"}
I0826 14:32:47.259668 1 leaderelection.go:248] attempting to acquire leader lease eksa-system/tink-leader-election...
I0826 14:32:47.271599 1 leaderelection.go:258] successfully acquired lease eksa-system/tink-leader-election
{"level":"info","ts":1661524367.271899,"logger":"fallback.controller.workflow","caller":"controller/controller.go:178","msg":"Starting EventSource","reconciler group":"tinkerbell.org","reconciler kind":"Workflow","source":"kind source: *v1alpha1.Workflow"}
{"level":"info","ts":1661524367.2719789,"logger":"fallback.controller.workflow","caller":"controller/controller.go:186","msg":"Starting Controller","reconciler group":"tinkerbell.org","reconciler kind":"Workflow"}
{"level":"info","ts":1661524367.2720811,"logger":"fallback.controller.workflow","caller":"controller/controller.go:220","msg":"Starting workers","reconciler group":"tinkerbell.org","reconciler kind":"Workflow","worker count":1}
{"level":"info","ts":1661524405.3835952,"logger":"fallback.controller.workflow","caller":"workflow/controller.go:40","msg":"Reconciling","reconciler group":"tinkerbell.org","reconciler kind":"Workflow","name":"my-eksa-cluster-control-plane-template-1661524403587-6lcgk","namespace":"eksa-system"}
{"level":"error","ts":1661524405.4845738,"logger":"fallback.controller.workflow","caller":"controller/controller.go:317","msg":"Reconciler error","reconciler group":"tinkerbell.org","reconciler kind":"Workflow","name":"my-eksa-cluster-control-plane-template-1661524403587-6lcgk","namespace":"eksa-system","error":"validating workflow template: name cannot be empty","errorVerbose":"name cannot be empty\ngithub.com/tinkerbell/tink/workflow.validate\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:127\ngithub.com/tinkerbell/tink/workflow.Parse\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:36\ngithub.com/tinkerbell/tink/workflow.RenderTemplateHardware\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:95\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).processNewWorkflow\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:90\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).Reconcile\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:60\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227\nruntime.goexit\n\truntime/asm_amd64.s:1581\nvalidating workflow 
template\ngithub.com/tinkerbell/tink/workflow.Parse\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:37\ngithub.com/tinkerbell/tink/workflow.RenderTemplateHardware\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:95\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).processNewWorkflow\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:90\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).Reconcile\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:60\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227\nruntime.goexit\n\truntime/asm_amd64.s:1581","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:317\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":1661524405.49004,"logger":"fallback.controller.workflow","caller":"workflow/controller.go:40","msg":"Reconciling","reconciler group":"tinkerbell.org","reconciler kind":"Workflow","name":"my-eksa-cluster-control-plane-template-1661524403587-6lcgk","namespace":"eksa-system"}
{"level":"error","ts":1661524405.4907446,"logger":"fallback.controller.workflow","caller":"controller/controller.go:317","msg":"Reconciler error","reconciler group":"tinkerbell.org","reconciler kind":"Workflow","name":"my-eksa-cluster-control-plane-template-1661524403587-6lcgk","namespace":"eksa-system","error":"validating workflow template: name cannot be empty","errorVerbose":"name cannot be empty\ngithub.com/tinkerbell/tink/workflow.validate\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:127\ngithub.com/tinkerbell/tink/workflow.Parse\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:36\ngithub.com/tinkerbell/tink/workflow.RenderTemplateHardware\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:95\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).processNewWorkflow\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:90\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).Reconcile\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:60\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227\nruntime.goexit\n\truntime/asm_amd64.s:1581\nvalidating workflow 
template\ngithub.com/tinkerbell/tink/workflow.Parse\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:37\ngithub.com/tinkerbell/tink/workflow.RenderTemplateHardware\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:95\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).processNewWorkflow\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:90\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).Reconcile\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:60\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227\nruntime.goexit\n\truntime/asm_amd64.s:1581","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:317\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227"}
This repeats until timeout.
Checking some of the cluster objects, we can see this:
root@eksa-admin:~# kubectl -n eksa-system get ma
NAME                                    CLUSTER           NODENAME   PROVIDERID   PHASE          AGE     VERSION
my-eksa-cluster-md-0-798f5b8594-ntqnb   my-eksa-cluster                           Pending        6m22s   v1.23.7-eks-1-23-4
my-eksa-cluster-p5wgl                   my-eksa-cluster                           Provisioning   6m22s   v1.23.7-eks-1-23-4
root@eksa-admin:~# kubectl -n eksa-system get tinkerbellmachine
NAME                                                         CLUSTER           STATE   READY   INSTANCEID                                  MACHINE
my-eksa-cluster-control-plane-template-1661524403587-6lcgk   my-eksa-cluster                   tinkerbell://eksa-system/eksa-node-cp-001   my-eksa-cluster-p5wgl
my-eksa-cluster-md-0-1661524403588-vgzm6                     my-eksa-cluster                                                               my-eksa-cluster-md-0-798f5b8594-ntqnb
root@eksa-admin:~# kubectl -n eksa-system get tpl
NAME                                                         STATE
my-eksa-cluster-control-plane-template-1661524403587-6lcgk
root@eksa-admin:~# kubectl -n eksa-system get hw
NAME               STATE
eksa-node-cp-001
eksa-node-dp-001
root@eksa-admin:~# kubectl -n eksa-system get wf
NAME                                                         TEMPLATE                                                     STATE
my-eksa-cluster-control-plane-template-1661524403587-6lcgk   my-eksa-cluster-control-plane-template-1661524403587-6lcgk
The empty state fields are concerning, so let's check out the workflow:
root@eksa-admin:~# kubectl -n eksa-system describe wf
Name:         my-eksa-cluster-control-plane-template-1661524403587-6lcgk
Namespace:    eksa-system
Labels:       <none>
Annotations:  <none>
API Version:  tinkerbell.org/v1alpha1
Kind:         Workflow
Metadata:
  Creation Timestamp:  2022-08-26T14:33:25Z
  Generation:          1
  Managed Fields:
    API Version:  tinkerbell.org/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          .:
          k:{"uid":"04162f45-d584-4d8f-a208-ba3bf4a31c3f"}:
      f:spec:
        .:
        f:hardwareMap:
          .:
          f:device_1:
        f:templateRef:
    Manager:    manager
    Operation:  Update
    Time:       2022-08-26T14:33:25Z
  Owner References:
    API Version:  infrastructure.cluster.x-k8s.io/v1beta1
    Controller:   true
    Kind:         TinkerbellMachine
    Name:         my-eksa-cluster-control-plane-template-1661524403587-6lcgk
    UID:          04162f45-d584-4d8f-a208-ba3bf4a31c3f
  Resource Version:  1655
  UID:               109d9e75-6c10-43d9-b82c-d6d7ec938c90
Spec:
  Hardware Map:
    device_1:    10:70:fd:7f:99:a2
  Template Ref:  my-eksa-cluster-control-plane-template-1661524403587-6lcgk
Events:  <none>
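The hardwareMap above is what Go-template placeholders like {{.device_1}} in the Tinkerbell template resolve against when the workflow is rendered. As a rough illustration only (this is not the actual tink rendering code, which uses Go's text/template), the substitution amounts to:

```python
import re

def render(template_text: str, hardware_map: dict[str, str]) -> str:
    """Roughly what happens when a Tinkerbell template is rendered against a
    workflow's hardwareMap: each {{.key}} placeholder is replaced by the
    mapped value (here device_1 -> the node's MAC address)."""
    return re.sub(r"\{\{\.(\w+)\}\}",
                  lambda m: hardware_map[m.group(1)],
                  template_text)
```

So a template line like worker: '{{.device_1}}' comes out as worker: '10:70:fd:7f:99:a2' for this workflow.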
And its template:
root@eksa-admin:~# kubectl -n eksa-system describe tpl
Name:         my-eksa-cluster-control-plane-template-1661524403587-6lcgk
Namespace:    eksa-system
Labels:       <none>
Annotations:  <none>
API Version:  tinkerbell.org/v1alpha1
Kind:         Template
Metadata:
  Creation Timestamp:  2022-08-26T14:33:25Z
  Generation:          1
  Managed Fields:
    API Version:  tinkerbell.org/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          .:
          k:{"uid":"04162f45-d584-4d8f-a208-ba3bf4a31c3f"}:
      f:spec:
        .:
        f:data:
    Manager:    manager
    Operation:  Update
    Time:       2022-08-26T14:33:25Z
  Owner References:
    API Version:  infrastructure.cluster.x-k8s.io/v1beta1
    Kind:         TinkerbellMachine
    Name:         my-eksa-cluster-control-plane-template-1661524403587-6lcgk
    UID:          04162f45-d584-4d8f-a208-ba3bf4a31c3f
  Resource Version:  1654
  UID:               1c932eba-9c10-464e-bf37-125d4bb13181
Spec:
  Data:  global_timeout: 6000
id: ""
name: my-eksa-cluster
tasks:
- actions:
  - environment:
      COMPRESSED: "true"
      DEST_DISK: /dev/sda
      IMG_URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/14/artifacts/raw/1-23/bottlerocket-v1.23.7-eks-d-1-23-4-eks-a-14-amd64.img.gz
    image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
    name: stream-image
    timeout: 600
  - environment:
      CONTENTS: |
        # Version is required, it will change as we support
        # additional settings
        version = 1
        # "eno1" is the interface name
        # Users may turn on dhcp4 and dhcp6 via boolean
        [enp1s0f0np0]
        dhcp4 = true
        dhcp6 = false
        # Define this interface as the "primary" interface
        # for the system. This IP is what kubelet will use
        # as the node IP. If none of the interfaces has
        # "primary" set, we choose the first interface in
        # the file
        primary = true
      DEST_DISK: /dev/sda12
      DEST_PATH: /net.toml
      DIRMODE: "0755"
      FS_TYPE: ext4
      GID: "0"
      MODE: "0644"
      UID: "0"
    image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
    name: write-netplan
    pid: host
    timeout: 90
  - environment:
      BOOTCONFIG_CONTENTS: |
        kernel {
            console = "ttyS1,115200n8"
        }
        init {
            systemd.log_level=debug
        }
      DEST_DISK: /dev/sda12
      DEST_PATH: /bootconfig.data
      DIRMODE: "0700"
      FS_TYPE: ext4
      GID: "0"
      MODE: "0644"
      UID: "0"
    image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
    name: write-bootconfig
    pid: host
    timeout: 90
  - environment:
      DEST_DISK: /dev/sda12
      DEST_PATH: /user-data.toml
      DIRMODE: "0700"
      FS_TYPE: ext4
      GID: "0"
      HEGEL_URLS: http://147.75.202.242:50061,http://147.75.202.253:50061
      MODE: "0644"
      UID: "0"
    image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
    name: write-user-data
    pid: host
    timeout: 90
  - image: public.ecr.aws/eks-anywhere/tinkerbell/hub/reboot:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
    name: reboot-image
    pid: host
    timeout: 90
  volumes:
  - /worker:/worker
  name: ""
  worker: ""
version: "0.1"
Events:  <none>
Anything else we need to know?: What else can I gather for you?
Environment:
- EKS Anywhere Release: 0.11.1
- Equinix Metal
Hey @cprivitere, thanks for reporting this. Would you mind modifying your TinkerbellTemplateConfig as in the diff below? Let me know if that helps, thanks!
diff --git a/current.yaml b/suggested.yaml
index 680348f..126877c 100644
--- a/current.yaml
+++ b/suggested.yaml
@@ -82,4 +82,5 @@ spec:
         timeout: 90
       volumes:
       - /worker:/worker
+      worker: '{{.device_1}}'
     version: "0.1"
Sure, added that, but it made no difference. Here's the new output from the tink-controller:
root@eksa-admin:~# kubectl -n eksa-system logs tink-controller-manager-7cbf8c4d66-5bcsl
{"level":"info","ts":1661538534.7211745,"caller":"tink-controller/main.go:106","msg":"no config file found","service":"github.com/tinkerbell/tink"}
{"level":"info","ts":1661538534.721232,"caller":"tink-controller/main.go:60","msg":"starting controller version 8011b72","service":"github.com/tinkerbell/tink"}
{"level":"info","ts":1661538534.811302,"logger":"fallback","caller":"manager/internal.go:362","msg":"Starting server","path":"/metrics","kind":"metrics","addr":"[::]:42173"}
{"level":"info","ts":1661538534.811356,"logger":"fallback","caller":"manager/internal.go:362","msg":"Starting server","kind":"health probe","addr":"[::]:38793"}
I0826 18:28:54.911702 1 leaderelection.go:248] attempting to acquire leader lease eksa-system/tink-leader-election...
I0826 18:28:54.924157 1 leaderelection.go:258] successfully acquired lease eksa-system/tink-leader-election
{"level":"info","ts":1661538534.924403,"logger":"fallback.controller.workflow","caller":"controller/controller.go:178","msg":"Starting EventSource","reconciler group":"tinkerbell.org","reconciler kind":"Workflow","source":"kind source: *v1alpha1.Workflow"}
{"level":"info","ts":1661538534.9245145,"logger":"fallback.controller.workflow","caller":"controller/controller.go:186","msg":"Starting Controller","reconciler group":"tinkerbell.org","reconciler kind":"Workflow"}
{"level":"info","ts":1661538534.924631,"logger":"fallback.controller.workflow","caller":"controller/controller.go:220","msg":"Starting workers","reconciler group":"tinkerbell.org","reconciler kind":"Workflow","worker count":1}
{"level":"info","ts":1661538581.1293182,"logger":"fallback.controller.workflow","caller":"workflow/controller.go:40","msg":"Reconciling","reconciler group":"tinkerbell.org","reconciler kind":"Workflow","name":"my-eksa-cluster-control-plane-template-1661538579174-bt6n6","namespace":"eksa-system"}
{"level":"error","ts":1661538581.2307472,"logger":"fallback.controller.workflow","caller":"controller/controller.go:317","msg":"Reconciler error","reconciler group":"tinkerbell.org","reconciler kind":"Workflow","name":"my-eksa-cluster-control-plane-template-1661538579174-bt6n6","namespace":"eksa-system","error":"validating workflow template: name cannot be empty","errorVerbose":"name cannot be empty\ngithub.com/tinkerbell/tink/workflow.validate\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:127\ngithub.com/tinkerbell/tink/workflow.Parse\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:36\ngithub.com/tinkerbell/tink/workflow.RenderTemplateHardware\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:95\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).processNewWorkflow\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:90\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).Reconcile\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:60\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227\nruntime.goexit\n\truntime/asm_amd64.s:1581\nvalidating workflow 
template\ngithub.com/tinkerbell/tink/workflow.Parse\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:37\ngithub.com/tinkerbell/tink/workflow.RenderTemplateHardware\n\tgithub.com/tinkerbell/tink/workflow/template_validator.go:95\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).processNewWorkflow\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:90\ngithub.com/tinkerbell/tink/pkg/controllers/workflow.(*Controller).Reconcile\n\tgithub.com/tinkerbell/tink/pkg/controllers/workflow/controller.go:60\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227\nruntime.goexit\n\truntime/asm_amd64.s:1581","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:317\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227"}
Here's the new my-eksa-cluster.yaml:
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-eksa-cluster
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:
    count: 1
    endpoint:
      host: "147.75.202.254"
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: my-eksa-cluster-cp
  datacenterRef:
    kind: TinkerbellDatacenterConfig
    name: my-eksa-cluster
  kubernetesVersion: "1.23"
  managementCluster:
    name: my-eksa-cluster
  workerNodeGroupConfigurations:
  - count: 1
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: my-eksa-cluster
    name: md-0
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellDatacenterConfig
metadata:
  name: my-eksa-cluster
spec:
  tinkerbellIP: "147.75.202.253"
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
  name: my-eksa-cluster-cp
spec:
  hardwareSelector:
    type: cp
  osFamily: bottlerocket
  templateRef:
    kind: TinkerbellTemplateConfig
    name: my-eksa-cluster
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDCwAFhyyBY/dLK7fq2redJswiJ/ecViaTMqnJYeSX+tWZ3qJFWWkZPDEXQ1Hpf6vguIhj7SRiWdxMtxifBvJyTRdXLdQQ+ueGqJerYbx6qKxvKvZ8ytqjX7dvLv34VbZeM3fOOTNYAnpb7sW2jJ384DKaQ8AQpqIJn+t8PkqKxNMyY8nbbg7X0SZWeGgFg+z8BibxurRWsv7ZD6ujlj4LuXPV8wL0K21HKDLkiBvgj6IdArL6vwSbXKe0VQByWkwVCVQcP16UVcbGPEKiI6/NYDqvb7931goch3Et7qJHrcg0y2YLChzyZujlCCvFPCK3XJpn7lhIJiiJEGAYe08/ANx6lYCiVk5CXpvBNLdioEQWefaXq6ohlknU5d7ZUB7VkRvk62D8T/Hl27ml+Y0ElmZcD2vTOJ1EZFkmJmHpVu1r7uj6wTOOZCPmGkbB+H1fDiX/BCmaKCW41ePr1SUz6v4NnirCZd+zFUcmpBObQcwgHXcKlp7tqdgF7b6ySQEbcRQAIuLUrd/KoZJm8f/UpAzL8jDwctZDZr1Z4NGqT3ZhViF79Tuo2gNg19qL8EGVMMAKj3U5gvXGdf00JKHOtNieiTBmVcnaw2w0+7Vt1nVaTy2v2cnxMr688dqta5Bv8VFFuRVoUVT4sir45OzwMUQAAydK9BTQ1FFNBaCpZXQ==
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
  name: my-eksa-cluster
spec:
  hardwareSelector:
    type: dp
  osFamily: bottlerocket
  templateRef:
    kind: TinkerbellTemplateConfig
    name: my-eksa-cluster
  users:
  - name: ec2-user
    sshAuthorizedKeys:
    - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDCwAFhyyBY/dLK7fq2redJswiJ/ecViaTMqnJYeSX+tWZ3qJFWWkZPDEXQ1Hpf6vguIhj7SRiWdxMtxifBvJyTRdXLdQQ+ueGqJerYbx6qKxvKvZ8ytqjX7dvLv34VbZeM3fOOTNYAnpb7sW2jJ384DKaQ8AQpqIJn+t8PkqKxNMyY8nbbg7X0SZWeGgFg+z8BibxurRWsv7ZD6ujlj4LuXPV8wL0K21HKDLkiBvgj6IdArL6vwSbXKe0VQByWkwVCVQcP16UVcbGPEKiI6/NYDqvb7931goch3Et7qJHrcg0y2YLChzyZujlCCvFPCK3XJpn7lhIJiiJEGAYe08/ANx6lYCiVk5CXpvBNLdioEQWefaXq6ohlknU5d7ZUB7VkRvk62D8T/Hl27ml+Y0ElmZcD2vTOJ1EZFkmJmHpVu1r7uj6wTOOZCPmGkbB+H1fDiX/BCmaKCW41ePr1SUz6v4NnirCZd+zFUcmpBObQcwgHXcKlp7tqdgF7b6ySQEbcRQAIuLUrd/KoZJm8f/UpAzL8jDwctZDZr1Z4NGqT3ZhViF79Tuo2gNg19qL8EGVMMAKj3U5gvXGdf00JKHOtNieiTBmVcnaw2w0+7Vt1nVaTy2v2cnxMr688dqta5Bv8VFFuRVoUVT4sir45OzwMUQAAydK9BTQ1FFNBaCpZXQ==
---
{}
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellTemplateConfig
metadata:
  name: my-eksa-cluster
spec:
  template:
    global_timeout: 6000
    id: ""
    name: my-eksa-cluster
    tasks:
    - actions:
      - environment:
          COMPRESSED: "true"
          DEST_DISK: /dev/sda
          IMG_URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/15/artifacts/raw/1-23/bottlerocket-v1.23.7-eks-d-1-23-4-eks-a-15-amd64.img.gz
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: stream-image
        timeout: 600
      - environment:
          CONTENTS: |
            # Version is required, it will change as we support
            # additional settings
            version = 1
            # "eno1" is the interface name
            # Users may turn on dhcp4 and dhcp6 via boolean
            [enp1s0f0np0]
            dhcp4 = true
            dhcp6 = false
            # Define this interface as the "primary" interface
            # for the system. This IP is what kubelet will use
            # as the node IP. If none of the interfaces has
            # "primary" set, we choose the first interface in
            # the file
            primary = true
          DEST_DISK: /dev/sda12
          DEST_PATH: /net.toml
          DIRMODE: "0755"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: write-netplan
        pid: host
        timeout: 90
      - environment:
          BOOTCONFIG_CONTENTS: |
            kernel {
                console = "ttyS1,115200n8"
            }
            init {
                systemd.log_level=debug
            }
          DEST_DISK: /dev/sda12
          DEST_PATH: /bootconfig.data
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: write-bootconfig
        pid: host
        timeout: 90
      - environment:
          DEST_DISK: /dev/sda12
          DEST_PATH: /user-data.toml
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          HEGEL_URLS: http://147.75.202.242:50061,http://147.75.202.253:50061
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: write-user-data
        pid: host
        timeout: 90
      - image: public.ecr.aws/eks-anywhere/tinkerbell/hub/reboot:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-15
        name: reboot-image
        pid: host
        timeout: 90
      volumes:
      - /worker:/worker
      worker: '{{.device_1}}'
    version: "0.1"
OK, I gave it a go, adding this full block at the end, which I saw in the project's testdata:
      name: my-eksa-cluster
      volumes:
      - /dev:/dev
      - /dev/console:/dev/console
      - /lib/firmware:/lib/firmware:ro
      worker: '{{.device_1}}'
The cluster creates successfully now.
root@eksa-admin:~# kubectl get nodes
NAME             STATUS   ROLES                  AGE     VERSION
147.75.202.243   Ready    control-plane,master   5m16s   v1.23.7-eks-7709a84
147.75.202.244   Ready    <none>                 2m32s   v1.23.7-eks-7709a84
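For reference, the working task-level shape can be sketched as follows. Indentation is illustrative; only a non-empty task name is confirmed required by the validator error seen in the logs, the worker field matches the maintainer's suggested diff, and the extra volume mounts were copied from the project's testdata:

```yaml
spec:
  template:
    # global_timeout, id, name as before
    tasks:
    - actions:
      # ... the stream-image / write-* / reboot actions from above ...
      name: my-eksa-cluster            # a non-empty task name satisfies the validator
      volumes:
      - /dev:/dev
      - /dev/console:/dev/console
      - /lib/firmware:/lib/firmware:ro
      worker: '{{.device_1}}'          # resolved from the workflow's hardwareMap
    version: "0.1"
```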
Given the error messages from tink-controller, the task-level name: field was clearly what it wanted. I don't know which of the other fields are required. Either way, this documentation needs to be updated with the correct values: https://anywhere.eks.amazonaws.com/docs/reference/clusterspec/baremetal/#advanced-bare-metal-cluster-configuration
Additionally, those docs have old EKS image references baked into them. Either the docs will need a way for users to generate a default template file whenever eks-a is updated, or they will need to be updated every time the images change, or some better system of templating that ignores the image versions will need to be created.
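Until the docs are updated, a quick pre-flight check can catch an incomplete template before a long provisioning timeout. This is an illustrative sketch, not part of eks-anywhere or tink; it assumes the template has already been loaded into a dict (e.g. with PyYAML), and only the "name cannot be empty" check is confirmed by the logs above, the worker check is inferred from the rendered workflow showing worker: "":

```python
def validate_template(template: dict) -> list[str]:
    """Mirror the kind of checks tink's template validator applies.

    Returns a list of human-readable problems; an empty list means the
    template would pass the "name cannot be empty" style validation seen
    in the tink-controller logs.
    """
    problems = []
    if not template.get("name"):
        problems.append("template: name cannot be empty")
    for i, task in enumerate(template.get("tasks", [])):
        if not task.get("name"):
            problems.append(f"tasks[{i}]: name cannot be empty")
        if not task.get("worker"):
            problems.append(f"tasks[{i}]: worker cannot be empty")
        for j, action in enumerate(task.get("actions", [])):
            if not action.get("name"):
                problems.append(f"tasks[{i}].actions[{j}]: name cannot be empty")
            if not action.get("image"):
                problems.append(f"tasks[{i}].actions[{j}]: image cannot be empty")
    return problems


# The failing template from this issue: task-level name/worker are missing.
broken = {"name": "my-eksa-cluster",
          "tasks": [{"actions": [{"name": "stream-image", "image": "image2disk"}]}]}
fixed = {"name": "my-eksa-cluster",
         "tasks": [{"name": "my-eksa-cluster", "worker": "{{.device_1}}",
                    "actions": [{"name": "stream-image", "image": "image2disk"}]}]}
```

Running validate_template(broken) flags the missing task name and worker, while validate_template(fixed) returns an empty list.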
Ah nice. Glad to hear you got it working! I will open a PR for the docs. Thanks for working through this!
PR opened. #3184