jenkins-x/terraform-aws-eks-jx

Resources not destroyed when using existing cluster

chrislovecnm opened this issue · 5 comments

Summary

When I run terraform destroy, multiple k8s resources are not cleaned up.

Steps to reproduce the behavior

Install JX3 on an existing cluster, then run terraform destroy.

Expected behavior

All JX3 k8s resources are removed.

Actual behavior

The following resources are left behind:

  1. All namespaces: jx-git-operator jx-production jx-staging jx-vault kuberhealthy nginx secret-infra tekton-pipelines
  2. CRDs
  3. ClusterRoleBindings
  4. Two EBS volumes (nexus and another)

Terraform version

The output of terraform version is:

Terraform v0.13.5
+ provider registry.terraform.io/hashicorp/aws v3.75.1
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.5.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.10.0
+ provider registry.terraform.io/hashicorp/local v2.2.2
+ provider registry.terraform.io/hashicorp/null v3.1.1
+ provider registry.terraform.io/hashicorp/random v3.1.2
+ provider registry.terraform.io/hashicorp/template v2.2.0
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.

Module version

v1.18.11

Operating system

Linux running inside of a container

This happens because v18 of the EKS module is very restrictive about security groups, so your node-to-node and control-plane-to-node connections are not working.
Try this:

  # Extend cluster security group rules
  cluster_security_group_additional_rules = {
    egress_nodes_ephemeral_ports_tcp = {
      description                = "To node 1025-65535"
      protocol                   = "tcp"
      from_port                  = 1025
      to_port                    = 65535
      type                       = "egress"
      source_node_security_group = true
    }
  }
  # Extend node-to-node security group rules
  node_security_group_additional_rules = {
    ingress_self_all = {
      description = "Node to node all ports/protocols"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true
    }
    egress_all = {
      description      = "Node all egress"
      protocol         = "-1"
      from_port        = 0
      to_port          = 0
      type             = "egress"
      cidr_blocks      = ["0.0.0.0/0"]
      ipv6_cidr_blocks = ["::/0"]
    }
    ingress_cluster_all = {
      description                   = "Cluster to node all ports/protocols"
      protocol                      = "-1"
      from_port                     = 0
      to_port                       = 0
      type                          = "ingress"
      source_cluster_security_group = true
    }
  }

Actually, resources not being destroyed is most likely a bug unrelated to what I posted above. I will look into it.

I am getting more errors when deleting a cluster that was up and running correctly:

Error: error deleting S3 Bucket (logs-foo-20220420182851392800000001): BucketNotEmpty: The bucket you tried to delete is not empty
	status code: 409, request id: 


Error: Kubernetes cluster unreachable: the server has asked for the client to provide credentials
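For the BucketNotEmpty error, one possible workaround is to empty the bucket before re-running the destroy. This is only a sketch: the empty_bucket helper is a hypothetical name, and it assumes the AWS CLI is installed with credentials that can delete objects in the bucket.

```shell
#!/bin/sh
# Hypothetical helper: empty an S3 bucket so terraform destroy can delete it.
# Assumes the AWS CLI is installed and credentials are configured.
# Note: `aws s3 rm --recursive` removes current objects only; a versioned
# bucket also needs its old versions and delete markers removed.
empty_bucket() {
  bucket="$1"
  if [ -z "$bucket" ]; then
    echo "usage: empty_bucket <bucket-name>" >&2
    return 1
  fi
  aws s3 rm "s3://${bucket}" --recursive
}

# Example, using the bucket name from the error above:
#   empty_bucket logs-foo-20220420182851392800000001
#   terraform destroy
```

An alternative is force_destroy = true on the aws_s3_bucket resource, which the AWS provider supports. Whether this module exposes that setting for the logs bucket is not confirmed here.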

I have set the KUBECONFIG and TF_KUBECONFIG env variables, but the Terraform helm provider is not picking them up.

KUBE_CONFIG_PATH=/path/to/kubeconfig

Setting this helped with the helm resources.
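As far as I know, the 2.x helm and kubernetes providers listed above read KUBE_CONFIG_PATH rather than KUBECONFIG, which would explain the behaviour. A minimal sketch of setting it before the destroy follows. The use_kubeconfig helper is a hypothetical name, and /path/to/kubeconfig is the same placeholder as above.

```shell
#!/bin/sh
# Hypothetical helper: export KUBE_CONFIG_PATH for the Terraform helm and
# kubernetes providers, but only if the kubeconfig file actually exists.
use_kubeconfig() {
  cfg="$1"
  if [ ! -f "$cfg" ]; then
    echo "kubeconfig not found: $cfg" >&2
    return 1
  fi
  export KUBE_CONFIG_PATH="$cfg"
}

# Example (placeholder path):
#   use_kubeconfig /path/to/kubeconfig && terraform destroy
```

Checking that the file exists first avoids a destroy run that fails halfway with the "cluster unreachable" error.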

$ k get configmaps
NAME                          DATA   AGE
config                        1      13d
ingress-config                5      13d
jenkins-x-docker-registry     2      13d
jenkins-x-extensions          2      13d
jx-install-config             1      13d
kapp-config                   1      13d
kube-root-ca.crt              1      13d
lighthouse-external-plugins   1      13d
nexus                         1      13d
plugins                       1      13d

These are not destroyed.

cert-manager           Active   13d
external-dns-private   Active   11d
jx                     Active   13d
jx-git-operator        Active   13d
jx-production          Active   13d
jx-staging             Active   13d
kuberhealthy           Active   13d
nginx                  Active   13d
secret-infra           Active   13d
tekton-pipelines       Active   13d

None of these namespaces are cleaned up either. That said, I don't think we want to delete jx-staging or jx-production :)
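To review what a manual cleanup would touch while keeping jx-staging and jx-production, a small filter over the listing above could look like the sketch below. The function name and the KEEP_NAMESPACES variable are hypothetical, and it only prints names; it does not delete anything.

```shell
#!/bin/sh
# Namespaces that should survive a cleanup (taken from the comment above).
KEEP_NAMESPACES="jx-staging jx-production"

# Reads lines in the "NAME STATUS AGE" shape shown above on stdin and prints
# the name of every namespace that is not in KEEP_NAMESPACES.
filter_namespaces() {
  while read -r ns _rest; do
    [ -n "$ns" ] || continue
    case " $KEEP_NAMESPACES " in
      *" $ns "*) continue ;;
    esac
    printf '%s\n' "$ns"
  done
}

# Example, with the listing above saved to a file (hypothetical file name):
#   filter_namespaces < leftover-namespaces.txt
```

Feed it only the JX-related listing. Piping the full output of kubectl get namespaces would also print system namespaces such as kube-system.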