kubernetes-retired/kube-aws

[0.15.x] ETCD stack template issue

jorge07 opened this issue · 8 comments

Given:

worker:
  nodePoolRollingStrategy: Parallel
  nodePools:
  - name: app
    subnets:
    - name: staging-subnet-1
    - name: staging-subnet-2
    - name: staging-subnet-3
  ...
  - name: ci-master
    subnets:
    - name: staging-subnet-2
  ...

Then running kube-aws0.15.0 diff --color outputs:

panic: interface conversion: interface {} is nil, not map[string]interface {}

goroutine 1 [running]:
github.com/kubernetes-incubator/kube-aws/core/root.getInstanceScriptUserdata(0xc000796000, 0x1ede, 0xc0003696e0, 0xe, 0x3, 0xc000434080, 0x2, 0x4)
        /Users/dave/code/hcom/src/github.com/kubernetes-incubator/kube-aws/core/root/stack.go:44 +0x3b5
github.com/kubernetes-incubator/kube-aws/core/root.(*Cluster).Diff(0xc00069c140, 0xc000200cb0, 0x1, 0x1, 0xffffffffffffffff, 0x946, 0x0, 0x0, 0x1eefb94, 0x1a)
        /Users/dave/code/hcom/src/github.com/kubernetes-incubator/kube-aws/core/root/cluster.go:407 +0xca5
github.com/kubernetes-incubator/kube-aws/cmd.runCmdDiff(0x2b6cc20, 0xc000200dd0, 0x0, 0x1, 0x0, 0x0)
        /Users/dave/code/hcom/src/github.com/kubernetes-incubator/kube-aws/cmd/diff.go:61 +0x2dd
github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra.(*Command).execute(0x2b6cc20, 0xc000200db0, 0x1, 0x1, 0x2b6cc20, 0xc000200db0)
        /Users/dave/code/hcom/src/github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra/command.go:826 +0x460
github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x2b6c220, 0xc0001e9f20, 0x103db0a, 0x2b32d40)
        /Users/dave/code/hcom/src/github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra/command.go:914 +0x2fb
github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra.(*Command).Execute(...)
        /Users/dave/code/hcom/src/github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra/command.go:864
main.main()
        /Users/dave/code/hcom/src/github.com/kubernetes-incubator/kube-aws/main.go:11 +0x31

Added some logs to see what's failing:

generating assets for control-plane, network, etcd, app, k8s-ci-master
Iteratin over:  staging
Iteratin over:  arn:aws:cloudformation:us-east-1:260056389714:stack/staging-Network-1UZX8IJE1A2VI/4f660400-043e-11e9-b9bc-0e8cd4caae34
Iteratin over:  arn:aws:cloudformation:us-east-1:260056389714:stack/staging-Etcd-78AY3H5CXGAX/652f1a60-043e-11e9-aef3-0a52d71b881a Etcdv3dot3i0LC
Name:  Etcdv3dot3i0LC
Stack template:  {"AWSTemplateFormatVersion":"2010-09-09","Description":"kube-aws etcd stack for staging","Parameters":{"NetworkStackName":{"Type":"String","Description":"The name of a network stack used to import values into this stack"}},"Resources":{"IAMInstanceProfileEtcd":{"Properties":{"Path":"/","Roles":[{"Ref":"IAMRoleEtcd"}]},"Type":"AWS::IAM::InstanceProfile"},"IAMManagedPolicyEtcd":{"Type":"AWS::IAM::ManagedPolicy","Properties":{"Description":"Policy for managing kube-aws k8s etcd nodes","Path":"/","PolicyDocument":{"Version":"2012-10-17","Statement":[{"Action":"kms:Decrypt","Effect":"Allow","Resource":"arn:aws:kms:us-east-1:260056389714:key/32c221f3-baea-4eac-8d69-988e7b070153"},{"Action":"ec2:DescribeTags","Effect":"Allow","Resource":"*"},{"Action":"ec2:DescribeVolumes","Effect":"Allow","Resource":"*"},{"Action":"ec2:AttachVolume","Effect":"Allow","Resource":"*"},{"Action":"ec2:DescribeVolumeStatus","Effect":"Allow","Resource":"*"},{"Action":"ec2:AssociateAddress","Effect":"Allow","Resource":"*"},{"Effect":"Allow","Action":["s3:GetObject"],"Resource":"arn:aws:s3:::kube-aws-us-east-1-staging/kube-aws/clusters/staging/exported/stacks/etcd/userdata-etcd*"},{"Effect":"Allow","Action":["s3:ListBucket"],"Resource":"arn:aws:s3:::kube-aws-us-east-1-staging"},{"Effect":"Allow","Action":["s3:List*","s3:GetObject*"],"Resource":"arn:aws:s3:::kube-aws-us-east-1-staging","Condition":{"StringLike":{"s3:prefix":{"Fn::Join":["",[{"Fn::Join":["",["kube-aws/clusters/staging/instances/",{"Fn::Select":["2",{"Fn::Split":["/",{"Ref":"AWS::StackId"}]}]},"/etcd-snapshots"]]},"/*"]]}}}},{"Effect":"Allow","Action":["s3:*"],"Resource":{"Fn::Join":["",["arn:aws:s3:::",{"Fn::Join":["",["kube-aws-us-east-1-staging/kube-aws/clusters/staging/instances/",{"Fn::Select":["2",{"Fn::Split":["/",{"Ref":"AWS::StackId"}]}]},"/etcd-snapshots"]]},"/*"]]}},{"Action":"ec2:DescribeInstances","Resource":"*","Effect":"Allow"}]}}},"IAMRoleEtcd":{"Properties":{"AssumeRolePolicyDocument":{"Statement":[{
"Action":["sts:AssumeRole"],"Effect":"Allow","Principal":{"Service":["ec2.amazonaws.com"]}}],"Version":"2012-10-17"},"Path":"/","ManagedPolicyArns":[{"Ref":"IAMManagedPolicyEtcd"}]},"Type":"AWS::IAM::Role"},"Etcd0EIP":{"Properties":{"Domain":"vpc"},"Type":"AWS::EC2::EIP"},"Etcd0EBS":{"Properties":{"AvailabilityZone":"us-east-1a","Size":"40","VolumeType":"gp2","Tags":[{"Key":"Name","Value":"staging-kube-aws-etcd-0"},{"Key":"kube-aws:etcd:index","Value":"0"},{"Key":"kube-aws:etcd:eip-allocation-id","Value":{"Fn::GetAtt":["Etcd0EIP","AllocationId"]}},{"Key":"kube-aws:etcd:advertised-hostname","Value":{"Fn::Join":[".",[{"Fn::Join":["-",["ec2",{"Fn::Join":["-",{"Fn::Split":[".",{"Ref":"Etcd0EIP"}]}]}]]},"compute-1.amazonaws.com"]]}},{"Key":"kube-aws:etcd:name","Value":"etcd0"}]},"Type":"AWS::EC2::Volume"},"Etcd0":{"Type":"AWS::AutoScaling::AutoScalingGroup","Properties":{"HealthCheckGracePeriod":600,"HealthCheckType":"EC2","LaunchConfigurationName":{"Ref":"Etcd0LC"},"MaxSize":"1","MetricsCollection":[{"Granularity":"1Minute"}],"MinSize":"1","Tags":[{"Key":"instanceRole","PropagateAtLaunch":"true","Value":"etcd"},{"Key":"kubernetes.io/cluster/staging","PropagateAtLaunch":"true","Value":"owned"},{"Key":"kube-aws:etcd_version","PropagateAtLaunch":"true","Value":"v3.2.26"},{"Key":"Name","PropagateAtLaunch":"true","Value":"staging-etcd-kube-aws-etcd-0"},{"Key":"kube-aws:role","PropagateAtLaunch":"true","Value":"etcd"}],"VPCZoneIdentifier":[{"Fn::ImportValue":"subnet-1-staging-us-east-1"}]},"CreationPolicy":{"ResourceSignal":{"Count":"1","Timeout":"PT15M"}},"UpdatePolicy":{"AutoScalingRollingUpdate":{"MinInstancesInService":"0","MaxBatchSize":"1","WaitOnResourceSignals":"true","PauseTime":"PT15M"}},"Metadata":{"AWS::CloudFormation::Init":{"configSets":{"etcd-server":["etcd-server-env"]},"etcd-server-env":{"files":{"/var/run/coreos/etcd-environment":{"content":{"Fn::Join":["",["ETCD_INITIAL_CLUSTER='","etcd0","=https://",{"Fn::Join":[".",[{"Fn::Join":["-",["ec2",{"Fn::Join":["-
",{"Fn::Split":[".",{"Ref":"Etcd0EIP"}]}]}]]},"compute-1.amazonaws.com"]]},":2380","'\n"]]}},"/var/run/coreos/etcdadm-environment":{"content":{"Fn::Join":["",["ETCD_ENDPOINTS='","https://",{"Fn::Join":[".",[{"Fn::Join":["-",["ec2",{"Fn::Join":["-",{"Fn::Split":[".",{"Ref":"Etcd0EIP"}]}]}]]},"compute-1.amazonaws.com"]]},":2379","'\n","AWS_DEFAULT_REGION='","us-east-1","'\n","KUBERNETES_CLUSTER='","staging","'\n","ETCDCTL_CACERT='","/etc/ssl/certs/etcd-trusted-ca.pem","'\n","ETCDCTL_CERT='","/etc/ssl/certs/etcd-client.pem","'\n","ETCDCTL_KEY='","/etc/ssl/certs/etcd-client-key.pem","'\n","ETCDCTL_CA_FILE='","/etc/ssl/certs/etcd-trusted-ca.pem","'\n","ETCDCTL_CERT_FILE='","/etc/ssl/certs/etcd-client.pem","'\n","ETCDCTL_KEY_FILE='","/etc/ssl/certs/etcd-client-key.pem","'\n","ETCDADM_MEMBER_SYSTEMD_SERVICE_NAME='","etcd-member","'\n","ETCDADM_CLUSTER_SNAPSHOTS_S3_URI='",{"Fn::Join":["",["s3://",{"Fn::Join":["",["kube-aws-us-east-1-staging/kube-aws/clusters/staging/instances/",{"Fn::Select":["2",{"Fn::Split":["/",{"Ref":"AWS::StackId"}]}]},"/etcd-snapshots"]]}]]},"'\n","ETCDADM_STATE_FILES_DIR='","/var/run/coreos/etcdadm","'\n","ETCDADM_MEMBER_ENV_FILE='","/var/run/coreos/etcdadm/etcd-member.env","'\n","ETCDADM_MEMBER_COUNT='","1","'\n","ETCDADM_MEMBER_INDEX='","0","'\n","ETCD_VERSION='","v3.2.26","'\n","ETCD_OPTS='","--quota-backend-bytes=2147483648","'\n"]]}}}}}},"DependsOn":["Etcd0EIP","Etcd0EBS"]},"Etcd0LC":{"Properties":{"BlockDeviceMappings":[{"DeviceName":"/dev/xvda","Ebs":{"VolumeSize":"40","VolumeType":"gp2"}}],"IamInstanceProfile":{"Ref":"IAMInstanceProfileEtcd"},"ImageId":"ami-0b1db01d775d666c2","InstanceType":"m4.large","KeyName":"staging","SecurityGroups":[{"Fn::ImportValue":{"Fn::Sub":"${NetworkStackName}-EtcdSecurityGroup"}}],"PlacementTenancy":"default","UserData":{"Fn::Base64":{"Fn::Join":["\n",["#!/bin/bash -xe","# s3-part-fingerprint: e5f4ab33205dbb176df76bce09c0cbf766228d14255dbb1dc753a9c08c05a159",{"Fn::Sub":"echo 
'KUBE_AWS_STACK_NAME=${AWS::StackName}' >>/var/run/coreos/etcd-node.env"},"echo 'KUBE_AWS_ETCD_INDEX=0' \u003e\u003e /var/run/coreos/etcd-node.env\n . /etc/environment\nexport COREOS_PRIVATE_IPV4 COREOS_PRIVATE_IPV6 COREOS_PUBLIC_IPV4 COREOS_PUBLIC_IPV6\nREGION=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r '.region')\nUSERDATA_FILE=userdata-etcd\n\nrun() {\n  bin=\"$1\"; shift\n  while ! /usr/bin/rkt run \\\n     --net=host \\\n     --volume=dns,kind=host,source=/etc/resolv.conf,readOnly=true --mount volume=dns,target=/etc/resolv.conf  \\\n     --volume=awsenv,kind=host,source=/var/run/coreos,readOnly=false --mount volume=awsenv,target=/var/run/coreos \\\n     --volume=etcdenv,kind=host,source=/var/run/coreos/etcd-node.env,readOnly=false --mount volume=etcdenv,target=/var/run/coreos/etcd-node.env  \\\n     --trust-keys-from-https \\\n     quay.io/coreos/awscli:master --exec=$bin -- \"$@\"; do\n    sleep 1\n  done\n}\nrun bash -c \"aws configure set s3.signature_version s3v4; aws s3 --region $REGION cp s3://kube-aws-us-east-1-staging/kube-aws/clusters/staging/exported/stacks/etcd/userdata-etcd-e5f4ab33205dbb176df76bce09c0cbf766228d14255dbb1dc753a9c08c05a159 /var/run/coreos/$USERDATA_FILE\"\n\nINSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)\n\nexec /usr/bin/coreos-cloudinit --from-file /var/run/coreos/$USERDATA_FILE\n"]]}}},"Type":"AWS::AutoScaling::LaunchConfiguration"}},"Outputs":{"Etcd0EIP":{"Description":"The EIP for etcd node 0","Value":{"Ref":"Etcd0EIP"},"Export":{"Name":{"Fn::Sub":"${AWS::StackName}-Etcd0EIP"}}},"StackName":{"Description":"The name of this stack which is used by node pool stacks to import outputs from this stack","Value":{"Ref":"AWS::StackName"}},"Etcd0FQDN":{"Description":"The FQDN for etcd node 0","Value":{"Fn::Join":[".",[{"Fn::Join":["-",["ec2",{"Fn::Join":["-",{"Fn::Split":[".",{"Ref":"Etcd0EIP"}]}]}]]},"compute-1.amazonaws.com"]]}}}}
panic: interface conversion: interface {} is nil, not map[string]interface {}

goroutine 1 [running]:
github.com/kubernetes-incubator/kube-aws/core/root.getInstanceScriptUserdata(0xc000924000, 0x1ede, 0xc0005c89e0, 0xe, 0x3, 0xc0008dad20, 0x2, 0x4)
	/Users/jorge/go/src/github.com/kubernetes-incubator/kube-aws/core/root/stack.go:46 +0x4fb
github.com/kubernetes-incubator/kube-aws/core/root.(*Cluster).Diff(0xc000332000, 0xc00006a330, 0x1, 0x1, 0xffffffffffffffff, 0x946, 0x0, 0x0, 0x1eef817, 0x1a)
	/Users/jorge/go/src/github.com/kubernetes-incubator/kube-aws/core/root/cluster.go:408 +0xd92
github.com/kubernetes-incubator/kube-aws/cmd.runCmdDiff(0x2b6bc20, 0xc00006a450, 0x0, 0x1, 0x0, 0x0)
	/Users/jorge/go/src/github.com/kubernetes-incubator/kube-aws/cmd/diff.go:61 +0x2dd
github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra.(*Command).execute(0x2b6bc20, 0xc00006a430, 0x1, 0x1, 0x2b6bc20, 0xc00006a430)
	/Users/jorge/go/src/github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra/command.go:826 +0x460
github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x2b6b220, 0xc0001fdf20, 0x103d8da, 0x2b31d40)
	/Users/jorge/go/src/github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra/command.go:914 +0x2fb
github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra.(*Command).Execute(...)
	/Users/jorge/go/src/github.com/kubernetes-incubator/kube-aws/vendor/github.com/spf13/cobra/command.go:864
main.main()
	/Users/jorge/go/src/github.com/kubernetes-incubator/kube-aws/main.go:11 +0x31

I'll investigate and see if I can find the root cause of this issue. As there's a major ETCD upgrade in this version, help is welcome ;)

So I have these resources in the etcd stack (single-node ETCD cluster):

      "Etcd0EIP":{  },
      "Etcd0EBS":{  },
      "Etcd0":{  },
      "Etcd0LC":{  }

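I expected Etcdv3dot3i0LC to be somewhere in the template, but it isn't there. The deployed stack only has Etcd0LC, so the lookup by the new name returns nil.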
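
For anyone hitting the same panic: the message "interface {} is nil, not map[string]interface {}" is what Go produces when a single-value type assertion is applied to a missing map entry. The actual code in core/root/stack.go is not shown in this thread, so the following is only a minimal sketch of the pattern, assuming the template is decoded into generic maps. The function name lookupResource and the trimmed-down template are hypothetical, made up for illustration. It uses the comma-ok form of the assertion so that a missing resource such as Etcdv3dot3i0LC becomes an error instead of a crash.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lookupResource is a hypothetical helper (not kube-aws code). It returns the
// named entry under "Resources" in a CloudFormation template. Every type
// assertion uses the comma-ok form, so a missing or malformed key yields an
// error rather than a panic.
func lookupResource(template []byte, name string) (map[string]interface{}, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal(template, &doc); err != nil {
		return nil, fmt.Errorf("parsing template: %w", err)
	}
	resources, ok := doc["Resources"].(map[string]interface{})
	if !ok {
		return nil, fmt.Errorf("template has no Resources object")
	}
	// resources[name] is a nil interface{} when the key is absent. A plain
	// resources[name].(map[string]interface{}) would panic at this point.
	res, ok := resources[name].(map[string]interface{})
	if !ok {
		return nil, fmt.Errorf("resource %q not found in template", name)
	}
	return res, nil
}

func main() {
	// Trimmed-down stand-in for the deployed etcd stack template.
	tmpl := []byte(`{"Resources":{"Etcd0LC":{"Type":"AWS::AutoScaling::LaunchConfiguration"}}}`)

	// The name the 0.15.x diff looks for is absent from the deployed stack.
	if _, err := lookupResource(tmpl, "Etcdv3dot3i0LC"); err != nil {
		fmt.Println("error:", err)
	}

	// The name that actually exists in the deployed stack resolves fine.
	if res, err := lookupResource(tmpl, "Etcd0LC"); err == nil {
		fmt.Println(res["Type"])
	}
}
```

This would only turn the panic into a readable error. It does not address why the diff looks for Etcdv3dot3i0LC when the deployed stack still uses the Etcd0LC name.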

Here is the ETCD config:

# ETCD Cluster
etcd:
  count: 1
  instanceType: m4.large
  instanceTags:
    instanceRole: etcd
  rootVolume:
    size: 40
    type: gp2
  dataVolume:
    size: 40
    type: gp2
  subnets:
    - name: staging-subnet-1
    - name: staging-subnet-2
    - name: staging-subnet-3
  snapshot:
    automated: true
  disasterRecovery:
    automated: true

Thanks in advance

Any idea if this issue still exists on master?

Potentially yes

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.
