equinix-labs/terraform-equinix-metal-eks-anywhere

Verify that all UEFI plans can be used

Opened this issue · 9 comments

Use the plans.
Ensure they provision correctly and are usable as CP and DP nodes.
Add the plans to the list of known good.

  • m3.small.x86
  • m3.large.x86
  • n3.xlarge.x86
  • a3.large.x86

Couldn't find enough a3.larges to test.

On n3.xlarge.x86, the Tinkerbell workflow fails immediately.

From kubectl get wf

NAME                                                         TEMPLATE                                                     STATE
my-eksa-cluster-control-plane-template-1664229893447-szn92   my-eksa-cluster-control-plane-template-1664229893447-szn92   STATE_FAILED
root@eksa-yn

From the tink-server logs

{"level":"info","ts":1664230139.677456,"caller":"server/kubernetes_api_workflow.go:200","msg":"updating workflow in Kubernetes","service":"github.com/tinkerbell/tink","actionName":"stream-image","status":"STATE_RUNNING","workflowID":"my-eksa-cluster-control-plane-template-1664229893447-szn92","taskName":"my-eksa-cluster","worker":"b4:96:91:b7:9e:a8"}
{"level":"info","ts":1664230141.5669775,"caller":"server/kubernetes_api_workflow.go:200","msg":"updating workflow in Kubernetes","service":"github.com/tinkerbell/tink","actionName":"stream-image","status":"STATE_FAILED","workflowID":"my-eksa-cluster-control-plane-template-1664229893447-szn92","taskName":"my-eksa-cluster","worker":"b4:96:91:b7:9e:a8"}
root@eksa-ync3ub-admin:~# kubectl -n eksa-system get ma
NAME                                    CLUSTER           NODENAME   PROVIDERID   PHASE          AGE   VERSION
my-eksa-cluster-8wzt6                   my-eksa-cluster                           Provisioning   12m   v1.23.9-eks-1-23-5
my-eksa-cluster-md-0-5886c8855f-g8h2d   my-eksa-cluster                           Pending        13m   v1.23.9-eks-1-23-5
root@eksa-ync3ub-admin:~# kubectl -n eksa-system get tinkerbellmachine
NAME                                                         CLUSTER           STATE   READY   INSTANCEID                                         MACHINE
my-eksa-cluster-control-plane-template-1664229893447-szn92   my-eksa-cluster                   tinkerbell://eksa-system/eksa-ync3ub-node-cp-001   my-eksa-cluster-8wzt6
my-eksa-cluster-md-0-1664229893449-6xvnk                     my-eksa-cluster                                                                      my-eksa-cluster-md-0-5886c8855f-g8h2d
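
To pull the failing action out of a longer tink-server log, the JSON lines can be filtered on their status field. This is a minimal sketch, not part of the project: the function name failed_actions is hypothetical, and the sample lines are abridged copies of the two log entries above (only the fields the filter reads are kept).

```python
import json


def failed_actions(log_lines):
    """Return (workflowID, actionName, worker) for each STATE_FAILED entry."""
    failures = []
    for line in log_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip shell prompts and other non-JSON output
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if entry.get("status") == "STATE_FAILED":
            failures.append(
                (entry.get("workflowID"), entry.get("actionName"), entry.get("worker"))
            )
    return failures


# Abridged from the tink-server output above.
sample = [
    '{"level":"info","msg":"updating workflow in Kubernetes",'
    '"actionName":"stream-image","status":"STATE_RUNNING",'
    '"workflowID":"my-eksa-cluster-control-plane-template-1664229893447-szn92",'
    '"worker":"b4:96:91:b7:9e:a8"}',
    '{"level":"info","msg":"updating workflow in Kubernetes",'
    '"actionName":"stream-image","status":"STATE_FAILED",'
    '"workflowID":"my-eksa-cluster-control-plane-template-1664229893447-szn92",'
    '"worker":"b4:96:91:b7:9e:a8"}',
]

for workflow_id, action, worker in failed_actions(sample):
    print(f"{workflow_id}: action {action} failed on worker {worker}")
```

For the entries above this points at the stream-image action, which went from STATE_RUNNING to STATE_FAILED in about two seconds.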

capt logs:

root@eksa-ync3ub-admin:~# kubectl -n capt-system logs capt-controller-manager-79c5cd596d-j4ffx  | grep -v "skipping BMCJob creation"
I0926 22:04:50.417103       1 request.go:665] Waited for 1.046601739s due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling/v2beta1?timeout=32s
I0926 22:04:50.521311       1 logr.go:261] controller-runtime/metrics "msg"="Metrics server is starting to listen"  "addr"="localhost:8080"
I0926 22:04:50.522958       1 logr.go:261] controller-runtime/builder "msg"="Registering a mutating webhook"  "GVK"={"Group":"infrastructure.cluster.x-k8s.io","Version":"v1beta1","Kind":"TinkerbellCluster"} "path"="/mutate-infrastructure-cluster-x-k8s-io-v1beta1-tinkerbellcluster"
I0926 22:04:50.523154       1 server.go:146] controller-runtime/webhook "msg"="Registering webhook" "path"="/mutate-infrastructure-cluster-x-k8s-io-v1beta1-tinkerbellcluster"
I0926 22:04:50.523291       1 logr.go:261] controller-runtime/builder "msg"="Registering a validating webhook"  "GVK"={"Group":"infrastructure.cluster.x-k8s.io","Version":"v1beta1","Kind":"TinkerbellCluster"} "path"="/validate-infrastructure-cluster-x-k8s-io-v1beta1-tinkerbellcluster"
I0926 22:04:50.523391       1 server.go:146] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-infrastructure-cluster-x-k8s-io-v1beta1-tinkerbellcluster"
I0926 22:04:50.523541       1 logr.go:261] controller-runtime/builder "msg"="skip registering a mutating webhook, object does not implement admission.Defaulter or WithDefaulter wasn't called"  "GVK"={"Group":"infrastructure.cluster.x-k8s.io","Version":"v1beta1","Kind":"TinkerbellMachine"}
I0926 22:04:50.523574       1 logr.go:261] controller-runtime/builder "msg"="Registering a validating webhook"  "GVK"={"Group":"infrastructure.cluster.x-k8s.io","Version":"v1beta1","Kind":"TinkerbellMachine"} "path"="/validate-infrastructure-cluster-x-k8s-io-v1beta1-tinkerbellmachine"
I0926 22:04:50.523669       1 server.go:146] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-infrastructure-cluster-x-k8s-io-v1beta1-tinkerbellmachine"
I0926 22:04:50.523836       1 logr.go:261] controller-runtime/builder "msg"="skip registering a mutating webhook, object does not implement admission.Defaulter or WithDefaulter wasn't called"  "GVK"={"Group":"infrastructure.cluster.x-k8s.io","Version":"v1beta1","Kind":"TinkerbellMachineTemplate"}
I0926 22:04:50.523876       1 logr.go:261] controller-runtime/builder "msg"="Registering a validating webhook"  "GVK"={"Group":"infrastructure.cluster.x-k8s.io","Version":"v1beta1","Kind":"TinkerbellMachineTemplate"} "path"="/validate-infrastructure-cluster-x-k8s-io-v1beta1-tinkerbellmachinetemplate"
I0926 22:04:50.523967       1 server.go:146] controller-runtime/webhook "msg"="Registering webhook" "path"="/validate-infrastructure-cluster-x-k8s-io-v1beta1-tinkerbellmachinetemplate"
I0926 22:04:50.524095       1 logr.go:261] setup "msg"="starting manager"  "version"="v0.0.0-master+$Format:%H$"
I0926 22:04:50.524197       1 server.go:214] controller-runtime/webhook/webhooks "msg"="Starting webhook server"
I0926 22:04:50.524516       1 internal.go:362]  "msg"="Starting server" "addr"={"IP":"::","Port":9440,"Zone":""} "kind"="health probe"
I0926 22:04:50.524589       1 logr.go:261] controller-runtime/certwatcher "msg"="Updated current TLS certificate"
I0926 22:04:50.524592       1 internal.go:362]  "msg"="Starting server" "addr"={"IP":"127.0.0.1","Port":8080,"Zone":""} "kind"="metrics" "path"="/metrics"
I0926 22:04:50.524711       1 logr.go:261] controller-runtime/webhook "msg"="Serving webhook server"  "host"="" "port"=9443
I0926 22:04:50.524739       1 leaderelection.go:248] attempting to acquire leader lease capt-system/controller-leader-election-capt...
I0926 22:04:50.524777       1 logr.go:261] controller-runtime/certwatcher "msg"="Starting certificate watcher"
I0926 22:04:50.538019       1 leaderelection.go:258] successfully acquired lease capt-system/controller-leader-election-capt
I0926 22:04:50.538415       1 controller.go:178] controller/tinkerbellcluster "msg"="Starting EventSource" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellCluster" "source"="kind source: *v1beta1.TinkerbellCluster"
I0926 22:04:50.538415       1 controller.go:178] controller/tinkerbellmachine "msg"="Starting EventSource" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "source"="kind source: *v1beta1.TinkerbellMachine"
I0926 22:04:50.538469       1 controller.go:178] controller/tinkerbellcluster "msg"="Starting EventSource" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellCluster" "source"="kind source: *v1beta1.Cluster"
I0926 22:04:50.538480       1 controller.go:178] controller/tinkerbellmachine "msg"="Starting EventSource" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "source"="kind source: *v1beta1.Machine"
I0926 22:04:50.538491       1 controller.go:186] controller/tinkerbellcluster "msg"="Starting Controller" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellCluster"
I0926 22:04:50.538509       1 controller.go:178] controller/tinkerbellmachine "msg"="Starting EventSource" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "source"="kind source: *v1beta1.TinkerbellCluster"
I0926 22:04:50.538536       1 controller.go:178] controller/tinkerbellmachine "msg"="Starting EventSource" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "source"="kind source: *v1beta1.Cluster"
I0926 22:04:50.538559       1 controller.go:178] controller/tinkerbellmachine "msg"="Starting EventSource" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "source"="kind source: *v1alpha1.Workflow"
I0926 22:04:50.538584       1 controller.go:178] controller/tinkerbellmachine "msg"="Starting EventSource" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "source"="kind source: *v1alpha1.BMCJob"
I0926 22:04:50.538612       1 controller.go:186] controller/tinkerbellmachine "msg"="Starting Controller" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
I0926 22:04:50.866683       1 controller.go:220] controller/tinkerbellcluster "msg"="Starting workers" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellCluster" "worker count"=10
I0926 22:04:50.866816       1 controller.go:220] controller/tinkerbellmachine "msg"="Starting workers" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "worker count"=10
I0926 22:04:53.881179       1 tinkerbellcluster_controller.go:84] controller/tinkerbellcluster "msg"="TinkerbellCluster object not found" "name"="my-eksa-cluster" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellCluster" "tinkerbellcluster"={"Namespace":"eksa-system","Name":"my-eksa-cluster"}
E0926 22:04:53.984467       1 tinkerbellmachine_controller.go:168]  "msg"="owning cluster is not found, skipping mapping." "error"=null "Namespace"="eksa-system" "TinkerbellCluster"="my-eksa-cluster"
I0926 22:04:53.985442       1 tinkerbellcluster_controller.go:107] controller/tinkerbellcluster "msg"="OwnerCluster is not set yet." "name"="my-eksa-cluster" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellCluster" "tinkerbellcluster"={"Namespace":"eksa-system","Name":"my-eksa-cluster"}
E0926 22:04:54.034855       1 tinkerbellmachine_controller.go:168]  "msg"="owning cluster is not found, skipping mapping." "error"=null "Namespace"="eksa-system" "TinkerbellCluster"="my-eksa-cluster"
I0926 22:04:54.035492       1 tinkerbellcluster_controller.go:241] controller/tinkerbellcluster "msg"="Setting cluster status to ready" "name"="my-eksa-cluster" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellCluster" "tinkerbellcluster"={"Namespace":"eksa-system","Name":"my-eksa-cluster"}
I0926 22:04:54.082024       1 tinkerbellcluster_controller.go:241] controller/tinkerbellcluster "msg"="Setting cluster status to ready" "name"="my-eksa-cluster" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellCluster" "tinkerbellcluster"={"Namespace":"eksa-system","Name":"my-eksa-cluster"}
I0926 22:04:54.179531       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-md-0-1664229893449-6xvnk"} "name"="my-eksa-cluster-md-0-1664229893449-6xvnk" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="Machine Controller has not yet set OwnerRef"
I0926 22:04:54.200924       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-md-0-1664229893449-6xvnk"} "name"="my-eksa-cluster-md-0-1664229893449-6xvnk" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="Machine Controller has not yet set OwnerRef"
I0926 22:04:54.267296       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-md-0-1664229893449-6xvnk"} "name"="my-eksa-cluster-md-0-1664229893449-6xvnk" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="Machine Controller has not yet set OwnerRef"
I0926 22:04:54.267418       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-md-0-1664229893449-6xvnk"} "name"="my-eksa-cluster-md-0-1664229893449-6xvnk" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="Machine Controller has not yet set OwnerRef"
I0926 22:04:54.273700       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-md-0-1664229893449-6xvnk"} "name"="my-eksa-cluster-md-0-1664229893449-6xvnk" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="retrieving bootstrap data: linked Machine's bootstrap.dataSecretName is not available yet"
I0926 22:04:54.366108       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-md-0-1664229893449-6xvnk"} "name"="my-eksa-cluster-md-0-1664229893449-6xvnk" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="retrieving bootstrap data: linked Machine's bootstrap.dataSecretName is not available yet"
I0926 22:04:54.366376       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-md-0-1664229893449-6xvnk"} "name"="my-eksa-cluster-md-0-1664229893449-6xvnk" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="retrieving bootstrap data: linked Machine's bootstrap.dataSecretName is not available yet"
I0926 22:04:55.393502       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-control-plane-template-1664229893447-szn92"} "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="Machine Controller has not yet set OwnerRef"
I0926 22:04:55.414291       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-control-plane-template-1664229893447-szn92"} "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="Machine Controller has not yet set OwnerRef"
I0926 22:04:55.426968       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-control-plane-template-1664229893447-szn92"} "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="Machine Controller has not yet set OwnerRef"
I0926 22:04:55.432441       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-control-plane-template-1664229893447-szn92"} "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="Machine Controller has not yet set OwnerRef"
I0926 22:04:55.476425       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-control-plane-template-1664229893447-szn92"} "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="retrieving bootstrap data: linked Machine's bootstrap.dataSecretName is not available yet"
I0926 22:04:55.486194       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-control-plane-template-1664229893447-szn92"} "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="retrieving bootstrap data: linked Machine's bootstrap.dataSecretName is not available yet"
I0926 22:04:55.525818       1 base.go:446] controller/tinkerbellmachine "msg"="machine is not ready yet" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-control-plane-template-1664229893447-szn92"} "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "reason"="retrieving bootstrap data: linked Machine's bootstrap.dataSecretName is not available yet"
I0926 22:04:55.776631       1 machine.go:414] controller/tinkerbellmachine "msg"="Selected Hardware for machine" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-control-plane-template-1664229893447-szn92"} "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine" "Hardware name"="eksa-ync3ub-node-cp-001"
I0926 22:04:55.987941       1 machine.go:332] controller/tinkerbellmachine "msg"="template for machine does not exist, creating" "TinkerbellMachine"={"Namespace":"eksa-system","Name":"my-eksa-cluster-control-plane-template-1664229893447-szn92"} "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:01.591387       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:01.609542       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:01.633221       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:01.767123       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:01.865250       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:01.920450       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:02.097272       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:02.435225       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:03.094626       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:04.392877       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:06.967239       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:12.107711       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:22.365520       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:09:42.864532       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:10:23.842660       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:11:45.780856       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:14:29.641203       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"
E0926 22:19:57.343321       1 controller.go:317] controller/tinkerbellmachine "msg"="Reconciler error" "error"="workflow failed" "name"="my-eksa-cluster-control-plane-template-1664229893447-szn92" "namespace"="eksa-system" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="TinkerbellMachine"

m3.large nodes fail in netdog while Bottlerocket boots. This likely means we need a different network interface name in the m3.large TinkerbellTemplateConfig file.

It probably needs to be enp65s0f0.
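
For context on why a name like enp65s0f0 depends on the hardware layout: under systemd's predictable naming scheme, the PCI-path form is p<bus>s<slot>f<function>, with the numbers written in decimal. The sketch below decodes such a name; the helper decode_pci_name is hypothetical, it assumes that naming scheme is what the node's OS uses, and it ignores domain prefixes and any suffix after the function number.

```python
import re

# PCI-path names such as enp65s0f0: type prefix (en = Ethernet),
# then p<bus>s<slot> and an optional f<function>, all decimal.
_PCI_NAME = re.compile(
    r"^(?P<type>en|wl|ww)p(?P<bus>\d+)s(?P<slot>\d+)(?:f(?P<function>\d+))?"
)


def decode_pci_name(name):
    """Return the PCI location encoded in a predictable NIC name, or None."""
    match = _PCI_NAME.match(name)
    if match is None:
        return None  # e.g. eth0, or a name built from another scheme
    bus = int(match.group("bus"))
    slot = int(match.group("slot"))
    function = int(match.group("function") or 0)
    return {
        "bus": bus,
        "slot": slot,
        "function": function,
        # lspci prints the same location in hexadecimal
        "pci_address": f"{bus:02x}:{slot:02x}.{function:x}",
    }


print(decode_pci_name("enp65s0f0"))
print(decode_pci_name("eth0"))
```

Under that assumption, enp65s0f0 is the NIC at PCI bus 65, slot 0, function 0, so the same card in a different slot or on a different bus would get a different name.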

Ok, status update: I have been unable to get m3.large, n3.xlarge, a3.large, or c2.medium to work. We need to work with some bottlerocket experts at AWS to get help debugging what's going wrong.

We're going to try the following configuration change in an attempt to reduce the configuration problems that go along with variability in NIC kernel device naming:

kernel {
  // <console stuff goes here>
}
init {
  net.ifnames=0
}

Is m3.small.x86 currently the only available instance plan for AWS EKS Anywhere?

@Paulius0112 Yes. The device name that is configured in this project and passed to Bottlerocket must match the kernel device name of the NIC. We opened #63 in an attempt to work around this.

https://github.com/equinix-labs/terraform-equinix-metal-eks-anywhere/blob/main/variables.tf#L92-L99 declares a variable that you can use to pass in a plan-specific kernel NIC device name. You'll want to verify the name and add it to the map.
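
One way to verify the name is to read it from sysfs on a Linux environment booted on the same plan. This is a rough sketch under assumptions: list_nics is a hypothetical helper, it needs a Python interpreter on that environment, and the names it reports are only meaningful if that OS applies the same interface naming as the image being provisioned.

```python
import os


def list_nics(sys_class_net="/sys/class/net"):
    """Return {interface_name: mac_address} for every non-loopback NIC."""
    nics = {}
    if not os.path.isdir(sys_class_net):
        return nics  # not a Linux host, or sysfs is not mounted
    for name in sorted(os.listdir(sys_class_net)):
        if name == "lo":
            continue
        address_path = os.path.join(sys_class_net, name, "address")
        try:
            with open(address_path) as handle:
                nics[name] = handle.read().strip()
        except OSError:
            continue  # entries such as bonding_masters have no address file
    return nics


if __name__ == "__main__":
    for name, mac in list_nics().items():
        print(f"{name}\t{mac}")
```

Comparing the reported MAC addresses with the one the hardware is registered under helps pick the right interface when a server has several.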

m3.small.x86 is the most tested plan with this integration. There may be other limitations (gotchas) outside of this plan.

We plan to revisit these issues after the next Bottlerocket and EKS-A releases, since the next Bottlerocket release addresses the NIC naming concerns by allowing MAC address mapping in addition to NIC name mapping.

@Paulius0112 We didn't bake additional plans into the map because the NIC device name can vary even when the plan name is known. NIC device names are sensitive to the NIC's placement within the server (vendor, bus, slot, etc.).

In our testing of the EKS-A module, we found this was very consistent with m3.small.x86.