upgrading 0.14 to 0.15: etcd migration failed; controller kubelet fails
flah00 opened this issue · 1 comments
flah00 commented
TL;DR
By manually editing the etcd.json.tmpl template, I was able to work around the etcd problem, but I have not been able to fix the controller-related issue.
Etcd
The export-existing-etcd-state.service was failing, with the error
Failed to start Exports Kubernetes Values from a remote Etcd cluster
This was because ETCD_ENDPOINTS was configured to use private host names in /var/run/coreos/etcdadm-environment-migration. To work around this issue, I had to update stack-templates/etcd.json.tmpl
diff --git a/k9s-zoo/stack-templates/etcd.json.tmpl b/k9s-zoo/stack-templates/etcd.json.tmpl
index a34df2d..8c3dabc 100644
--- a/k9s-zoo/stack-templates/etcd.json.tmpl
+++ b/k9s-zoo/stack-templates/etcd.json.tmpl
@@ -411,9 +411,7 @@
{{ if $.EtcdMigrationEnabled -}}
"/var/run/coreos/etcdadm-environment-migration": {
"content": { "Fn::Join" : [ "", [
- "ETCD_ENDPOINTS='",
- "{{ $.EtcdMigrationExistingEndpoints }}",
- "'\n",
+ "ETCD_ENDPOINTS='https://PUBLIC_HOST_1:2379,https://PUBLIC_HOST_2:2379,https://PUBLIC_HOST_3:2379'",
"AWS_DEFAULT_REGION='",
"{{$.Region}}",
"'\n",
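A minimal sketch of how the fragments in this hunk concatenate, assuming the diff above is exactly what was applied and that Fn::Join with an empty delimiter simply concatenates its list. The function name fn_join and the variable names are hypothetical; the region value is taken from the log further down. Read literally, the replacement line drops the trailing "'\n" fragment, so ETCD_ENDPOINTS and AWS_DEFAULT_REGION would land on the same line of the environment file. This may be worth double-checking before reusing the workaround.

```python
def fn_join(delimiter, parts):
    """Mimic CloudFormation's Fn::Join: concatenate parts with a delimiter."""
    return delimiter.join(parts)


# Placeholder values standing in for the rendered template fields (assumptions).
endpoints = (
    "https://PUBLIC_HOST_1:2379,"
    "https://PUBLIC_HOST_2:2379,"
    "https://PUBLIC_HOST_3:2379"
)
region = "us-east-1"

# Fragments as they appear on the "-" side of the diff.
original = fn_join("", [
    "ETCD_ENDPOINTS='", endpoints, "'\n",
    "AWS_DEFAULT_REGION='", region, "'\n",
])

# Fragments as they appear on the "+" side of the diff (no "\n" after the quote).
patched = fn_join("", [
    "ETCD_ENDPOINTS='" + endpoints + "'",
    "AWS_DEFAULT_REGION='", region, "'\n",
])

# Same hard-coded endpoints, but keeping the newline after the closing quote.
patched_with_newline = fn_join("", [
    "ETCD_ENDPOINTS='" + endpoints + "'\n",
    "AWS_DEFAULT_REGION='", region, "'\n",
])

for name, text in [("original", original),
                   ("patched", patched),
                   ("patched_with_newline", patched_with_newline)]:
    print(name, "->", len(text.splitlines()), "line(s)")
```

If the single-line result is not intended, ending the hard-coded string with '\n inside the JSON string (as in patched_with_newline) keeps the two variables on separate lines.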
Controller
Once I get past etcd, the kubelet on the controllers fails with a networking error.
Jul 03 18:55:52 HOST.ec2.internal sh[20858]: F0703 18:55:52.153512 20858 server.go:273] failed to run Kubelet: could not init cloud provider "aws": error finding instance i-006bdcc9632d50a6c: "error listing AWS instances: \"RequestError: send request failed\\ncaused by: Post https://ec2.us-east-1.amazonaws.com/: dial tcp: lookup ec2.us-east-1.amazonaws.com on [::1]:53: read udp [::1]:50109->[::1]:53: read: connection refused\""
Jul 03 18:55:52 HOST.ec2.internal systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION
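One possible reading of the log above: the lookup went to [::1]:53, which would be consistent with the resolver finding no configured nameserver and falling back to localhost. That is an assumption, not something the log confirms. The sketch below is a hedged diagnostic for checking that reading wherever Python 3 is available (for example a debugging container; the controller host itself may not ship Python). The helper names parse_nameservers, read_nameservers, and can_resolve are hypothetical.

```python
import socket


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines ("#" or ";").
        if not stripped or stripped.startswith(("#", ";")):
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def read_nameservers(path="/etc/resolv.conf"):
    """Read nameservers from a resolv.conf file; empty list if unreadable."""
    try:
        with open(path) as handle:
            return parse_nameservers(handle.read())
    except OSError:
        return []


def can_resolve(hostname, port=443):
    """Return True if the system resolver can look up the hostname."""
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        return False
```

Calling read_nameservers() and then can_resolve("ec2.us-east-1.amazonaws.com") in the same environment the kubelet runs in would show whether any nameserver is configured there and whether the EC2 endpoint from the log resolves.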
flah00 commented
The controller issues are all related to the aws-iam-auth plugin.
These issues were not present when I updated the feature on the 0.14.x branch.