[EKS] [request]: Nodegroup should support tagging ASGs
bhops opened this issue · 122 comments
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Tell us about your request
It would be great if we could pass tags to the underlying ASGs (and tell the ASGs to propagate tags) that are created from the managed node groups for EKS so that the underlying instances/volumes are tagged appropriately.
Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Currently, managed node groups for EKS do not support propagating tags to the ASGs (and therefore the instances) created by the node group. This leads to EC2 instances that are not tagged according to our requirements for tracking cost and resource ownership.
Passing the managed node group tags to the launch template's "Instance tags" will automatically apply them to both the EC2 instances and their volumes. If there are challenges doing that, creating a separate "Custom Tags" section in the EKS managed node group configuration page would also be helpful.
Workaround to add custom tags to worker nodes using an EKS managed node group:
- Create a managed worker node group in the EKS console (set minimum & desired count to 1).
- EKS creates an ASG in the background. You will find the ASG information in the node group details in the EKS console. Select the ASG associated with the managed worker node group > Tags > add your custom tags for EC2. Note: make sure to check "Tag New Instances" while creating the new tags.
- Terminate the newly launched EC2 instance that came up without tags.
- Scale up the managed node group as required.
- After completing the above steps, the EKS managed node group will tag new EC2 instances with the custom tags.
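For anyone who wants to script that console workaround rather than clicking through it, here is a minimal sketch using boto3. The cluster name, node group name, and tag set are placeholders you would replace with your own; it looks up the ASG(s) behind the node group and tags them with propagation enabled, which is essentially steps 2 and 5 above:

import boto3

# Example inputs: adjust to your own cluster, node group, and tag set.
CLUSTER = "my-cluster"
NODEGROUP = "my-nodegroup"
TAGS = {"team": "platform", "cost-center": "1234"}

eks = boto3.client("eks")
autoscaling = boto3.client("autoscaling")

# Find the ASG(s) EKS created behind the managed node group.
resources = eks.describe_nodegroup(
    clusterName=CLUSTER, nodegroupName=NODEGROUP
)["nodegroup"]["resources"]

for asg in resources["autoScalingGroups"]:
    # Equivalent of ticking "Tag New Instances" in the console.
    autoscaling.create_or_update_tags(
        Tags=[
            {
                "ResourceId": asg["name"],
                "ResourceType": "auto-scaling-group",
                "Key": key,
                "Value": value,
                "PropagateAtLaunch": True,
            }
            for key, value in TAGS.items()
        ]
    )

Existing instances won't pick the tags up retroactively; as the workaround notes, you still need to cycle the instances (or tag them directly) for the tags to show up on already-running nodes.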
This is a crucial feature that is missing, and it is the only reason our department is not moving from manual ASGs to node groups.
Yes, this is a very important change. We also cannot use node groups because of the need for tags. It is bad practice to be forced into semi-automated infrastructure as code.
Any update?
Any updates? Can you open-source node groups so the community can contribute?
Any updates?
I don't think it's a duplicate. This one is for an API feature to add tags to the ASG created by the API, and also to be able to set the flag on the ASG that propagates tags outwards; so it's only an API change to implement the same thing done manually in the workaround above.
#374 is for the EKS Cluster object itself to support propagating tags down, in the way ASGs already do. I imagine #374 would partially work by propagating tags to ASGs, and then turning on ASG tag propagation, rather than duplicating the behaviour.
Team: Having this functionality available will enable customers to use Cluster Autoscaler's capacity autodiscovery feature instead of forcing them to maintain manual capacity mappings on the command line.
The documentation there isn't super clear (see kubernetes/autoscaler#3198 for documentation updates), but advertising capacity resources to Cluster Autoscaler via ASG tags will make the use of multiple heterogeneous Auto Scaling Groups much easier for customers.
@otterley While Managed Nodegroup doesn't support customer provided tags for ASGs today, we do add the necessary tags for CAS auto discovery to the ASG, i.e. k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<CLUSTER NAME>.
@rtripat Understood. Perhaps I wasn't clear, but I was specifically referring to the ability to autodiscover specific capacity dimensions of an ASG such as cpu, memory, ephemeral storage, GPU, etc.
Until this feature is ready, I've had success with creating a CloudWatch rule on the EC2 "pending" state that invokes a Lambda, which takes the instance_id passed in through the event, checks whether the instance is part of a managed node group, and then adds the appropriate tags. I'm doing this all through Terraform as part of spinning up the EKS cluster.
Obviously it would be much easier with a tags option! 👍
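A rough sketch of that kind of Lambda, for anyone who wants to reproduce the approach. It assumes an EventBridge/CloudWatch rule on the EC2 instance state-change event for the "pending" state; the cluster name and extra tags are placeholders, and the membership check relies on the eks:cluster-name tag that managed node groups apply to their instances (check what tags your nodes actually carry before depending on it):

import boto3

ec2 = boto3.client("ec2")

# Placeholder values: replace with your own cluster name and tag set.
CLUSTER_NAME = "my-cluster"
EXTRA_TAGS = [{"Key": "team", "Value": "platform"}]


def lambda_handler(event, context):
    # EC2 state-change events carry the instance id in the detail block.
    instance_id = event["detail"]["instance-id"]

    # Look at the tags already on the instance to decide whether it belongs
    # to one of this cluster's managed node groups.
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    tags = {t["Key"]: t["Value"] for t in reservations[0]["Instances"][0].get("Tags", [])}

    is_cluster_node = tags.get("eks:cluster-name") == CLUSTER_NAME
    if is_cluster_node:
        # Add the tags that the managed node group couldn't propagate for us.
        ec2.create_tags(Resources=[instance_id], Tags=EXTRA_TAGS)

    return {"instance": instance_id, "tagged": is_cluster_node}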
It would also be great to be able to tag the launch templates, with the option to propagate those tags to instances and volumes or not.
Is there some kind of best practice on tagging ASG vs tagging LT? It seems to me that tagging LT offers more flexibility (like the ability to tag the volumes).
https://docs.aws.amazon.com/autoscaling/ec2/userguide/autoscaling-tagging.html touches upon the overlap in tag propagation between ASGs and Launch Templates.
That's precisely the documentation page I had in mind when asking about best practices ;-) This page explains the overlap but there are no clear pros and cons of the two tagging approaches. But it seems to me that LT offers more flexibility and that ASG tags should be used only when necessary (like for the cluster autoscaler discovery tags).
There's a related discussion about tagging ASGs and LTs for non-managed Nodegroups at eksctl-io/eksctl#1603. My understanding from there is that tagging LTs and enabling propagation would be sufficient, but there might be use-cases where the ASG needs to have the tag too; it wouldn't then need to also support propagation.
The difference observed in that ticket is that the ASG propagation applies the tags after launch, while LT propagation applies the tags as part of the launch.
Yes, I create my non-managed node groups using Terraform and put the tags on the LT with propagation to instances and volumes. The only tags I needed to put on ASG are the cluster autoscaler related tags. But propagation is not needed for these tags.
We need this feature too; it impacts cost calculation if we have to add the tags to the ASG manually later.
We have EKS deployed as a new part of our stacks in prod through preprod, stage and dev (alongside a very large ECS deployment in each environment). It is very annoying that the instances are not tagged for cost allocation.
+1, cost calculations are really important.
I would also like to see custom names or name prefixes for the autoscaling groups. The auto-generated uuid naming really slows down management of larger clusters.
With managed node groups support for launch templates, you can now add tags to the EC2 instances created as part of your node groups. See EKS docs for details.
I will leave this issue open for a little while, as I want to get some more feedback. The issue as originally opened asks for tags on ASGs, but I suspect most of you ultimately care about tags on EC2 instances, not the ASGs. Please leave any comments if you still have a need for tags on the ASGs themselves. Our vision is we handle any of these ASG tags for you, for example when we implement scale to 0 #724, we'll automatically add the required tags to the ASG.
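For reference, wiring a custom launch template into a managed node group at the API level looks roughly like the sketch below. All names, subnets, and ARNs are placeholders; the launch template (created separately) is where the EC2 instance tags live, while the node group's own tags parameter only tags the node group object itself:

import boto3

eks = boto3.client("eks")

# Placeholder names/ARNs: substitute your own cluster, subnets, and role.
eks.create_nodegroup(
    clusterName="my-cluster",
    nodegroupName="tagged-workers",
    subnets=["subnet-aaaa", "subnet-bbbb"],
    nodeRole="arn:aws:iam::111122223333:role/eks-node-role",
    scalingConfig={"minSize": 1, "maxSize": 3, "desiredSize": 2},
    # The referenced launch template carries the instance/volume tags;
    # "version" is a string and should usually be pinned explicitly.
    launchTemplate={"name": "eks-workers-example", "version": "1"},
)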
I will leave this issue open for a little while, as I want to get some more feedback. The issue as originally opened asks for tags on ASGs, but I suspect most of you ultimately care about tags on EC2 instances, not the ASGs.
As the original issue creator, I can confirm that being able to tag the underlying EC2 instances was indeed the intent of the original ask. Though others may have had other reasons for wanting ASGs to be tagged.
Thank you to the EKS team for implementing this!
Tags on the ASG are crucial if the ASG scales to zero. The cluster autoscaler for example will use the ASG tags if they exist. Without a way to propagate tags to the ASGs, we either have to run with unnecessary hosts or we have to bootstrap the ASGs directly.
@dindurthy As I mentioned above, "Our vision is we handle any of these ASG tags for you, for example when we implement scale to 0 #724, we'll automatically add the required tags to the ASG."
@mikestef9 While it is an admirable goal to handle the ASG tags automatically, it seems unlikely you will be able to do it quickly or easily. There are tags for node labels, node taints, and node resources, and it is unlikely EKS will be aware of all of them because of the various ways they can be created. At the moment it appears I cannot even get tags to propagate from the launch template to elastic GPUs (it seems EKS makes a copy of the launch template rather than use it directly, and the copy disables the "tag elastic graphics" setting), which makes me wary of trusting automatic behavior. I would rather you implement direct ASG tagging (or at least copying Launch Template tags to the ASG) first and see about automation later.
Did the tags use to work? I thought they did, but all my nodegroups no longer have the tags specified in the eksctl configuration. I need to get these tags back in, as they are used for cost reporting.
The docs say the feature is there now, but it does not work. Is it a tf 0.13.x feature, as I am still on 0.12.x (don't want to move to 0.13 yet)? It would be nice if the EC2 worker instances had a meaningful name, rather than just a hyphen.
tags = {
  "Name" = "eks-${var.cluster-name}-1"
}
I really want tags on ASGs.
I attach a target group to my ASG (we don't use type: LoadBalancer; we use NodePort). Since we have multiple ASGs for different purposes, we need to be able to identify them in order to attach the target group.
We really need to pass the node group tags to EC2 instances and volumes; otherwise we have to query the instances of the EKS node group and tag them outside of the automated process.
Tags can be propagated to EC2 instances, but let's say I need my EC2 instances to be tagged node-01, node-02, node-03, and so on. That is not happening, because the ASG is what triggers the node launches, not the launch template. This is very important.
I need to tag the Autoscaling Group itself, not EC2.
I want to monitor the desired capacity of the AutoscalingGroup in Datadog, and I need to be able to set arbitrary tags on the AutoscalingGroup itself in order to be able to use it comfortably.
Autoscaling Groups created by Managed NodeGroups do not output metrics to CloudWatch, which is another issue, but tagging is still important
It's kind of frustrating not to be able to tag our node group instances programmatically. In my case, I'm using Terraform and already tried tags and additional_tags, and neither one propagated the tags to the ASG or to the instances themselves. Our main goal with these tags is cost allocation, so it would be extremely helpful.
Please leave any comments if you still have a need for tags on the ASGs themselves.
+1
Is @ravvereddy's use-case (of having per-node tags generated by ASG) actually supported?
I don't see anything in the docs hinting that there is some kind of templating for tags propagated from ASGs, so it seems feature-wise that ASG tagging doesn't bring anything more for node tagging than Launch Templates do.
I think it'd be particularly helpful to know if there are any use-cases for ASG tags propagating to nodes that aren't covered by Launch Template instance tags. I'd assume the latter can cover cost-allocation tracking or metric identification for EC2 instances, for example.
If not, then this question becomes simpler as then we have a clear "best practice" for tagging EC2 instances (Launch Templates, which already works), and this ticket can focus on the remaining needs for ASG-specific tags.
Tagging of ASGs themselves is still needed for Cluster Autoscaler scale-to-zero (#724 should cover the specifics of that use-case, I hope, as they do not require instance propagation) and resource ownership identification on accounts shared between teams, which is the use-case I've had in the past. My studio has graduated to multiple accounts under AWS Organizations, so that use-case has fallen off my radar now.
Tagging of the ASGs themselves is handy for some other stuff we want to run, e.g. https://github.com/AutoSpotting/AutoSpotting requires a tag on the ASG for it to do its thing.
I have created a custom resource which tags the ASG and propagates to EC2 instances. Our cluster was created as below:
### EKS control plane ###
Cluster:
  Type: AWS::EKS::Cluster
  Properties:
    Name: !Sub ${EKSClusterName}-${Environment}
    Version: !Sub ${KubernetesVersion}
    RoleArn: !GetAtt ClusterRole.Arn
    ResourcesVpcConfig:
      SecurityGroupIds:
        - !Ref ClusterControlPlaneSecurityGroup
      SubnetIds:
        - Fn::ImportValue: !Sub ${VpcStackName}-${Environment}-private-a
        - Fn::ImportValue: !Sub ${VpcStackName}-${Environment}-private-b
        - Fn::ImportValue: !Sub ${VpcStackName}-${Environment}-private-c
The node group was created like this:
### Create EKS managed node group ###
Nodegroup:
  DependsOn: Cluster
  Type: 'AWS::EKS::Nodegroup'
  Properties:
    NodegroupName: !Sub ${EKSClusterName}-node-${Environment}
    ClusterName: !Ref Cluster
    InstanceTypes:
      - !Ref NodeInstanceType
    DiskSize: !Ref NodeVolumeSize
    RemoteAccess:
      Ec2SshKey: !Sub ${EKSClusterName}-${Environment}
      SourceSecurityGroups:
        - !Ref NodeSecurityGroup
    NodeRole: !GetAtt NodeInstanceRole.Arn
    ScalingConfig:
      MinSize: !Ref NodeGroupMinSize
      MaxSize: !Ref NodeGroupMaxSize
      DesiredSize: !If [IsNotProd, 1, !Ref NodeGroupDesiredCapacity]
    Labels:
      type: !Ref Environment
    Subnets:
      - Fn::ImportValue: !Sub ${VpcStackName}-${Environment}-private-a
      - Fn::ImportValue: !Sub ${VpcStackName}-${Environment}-private-b
      - Fn::ImportValue: !Sub ${VpcStackName}-${Environment}-private-c
Then we tag the ASG with the custom resource (the tag name is "Name" and our tag value is the cluster name):
### Tag resources ###
AsgTaggingRole:
  Type: AWS::IAM::Role
  Properties:
    AssumeRolePolicyDocument:
      Version: '2012-10-17'
      Statement:
        - Effect: Allow
          Principal:
            Service:
              - lambda.amazonaws.com
          Action:
            - sts:AssumeRole
    Path: "/"
    Policies:
      - PolicyName: lambda-logging
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
            - Effect: Allow
              Action:
                - logs:CreateLogGroup
                - logs:CreateLogStream
                - logs:PutLogEvents
              Resource: arn:aws:logs:*:*:*
      - PolicyName: lambda-tagging
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
            - Effect: Allow
              Action:
                - autoscaling:CreateOrUpdateTags
              Resource:
                - '*'
            - Effect: Allow
              Action:
                - eks:DescribeNodegroup
              Resource: '*'

AsgTagging:
  Type: Custom::AsgTagging
  Properties:
    ServiceToken: !GetAtt AsgTaggingFunction.Arn
    AsgId: !GetAtt Nodegroup.NodegroupName # The node group name

AsgTaggingFunction:
  Type: AWS::Lambda::Function
  Properties:
    Runtime: python3.7
    Handler: index.lambda_handler
    MemorySize: 128
    Role: !GetAtt AsgTaggingRole.Arn
    Timeout: 120
    Environment:
      Variables:
        TAG_KEY: Name
        TAG_VALUE: !Ref Cluster # The EKS cluster name
        EKS_CLUSTER: !Ref Cluster # The EKS cluster name
        NODE_GROUP: !GetAtt Nodegroup.NodegroupName # The node group name
    Code:
      ZipFile: |
        import boto3
        from botocore.exceptions import ClientError
        import os
        import cfnresponse

        def lambda_handler(event, context):
            print("Event:", event)
            data = {}
            tag_key = os.getenv('TAG_KEY')
            tag_value = os.getenv('TAG_VALUE')
            eks_cluster = os.getenv('EKS_CLUSTER')
            node_group = os.getenv('NODE_GROUP')
            try:
                eks = boto3.client('eks')
                # Retrieve the autoscaling group name behind the node group
                asg = eks.describe_nodegroup(clusterName=eks_cluster, nodegroupName=node_group)['nodegroup']['resources']['autoScalingGroups'][0]['name']
            except Exception as e:
                print(e)
            try:
                client = boto3.client('autoscaling')
                if event['RequestType'] == 'Create':
                    res = client.create_or_update_tags(
                        Tags=[
                            {
                                'Key': tag_key,
                                'PropagateAtLaunch': True,
                                'ResourceId': asg,
                                'ResourceType': 'auto-scaling-group',
                                'Value': tag_value,
                            }
                        ],
                    )
                    data["Reason"] = "The ASG " + asg + " has been tagged."
                    cfnresponse.send(event, context, cfnresponse.SUCCESS, data)
                elif event['RequestType'] == 'Update':
                    res = client.create_or_update_tags(
                        Tags=[
                            {
                                'Key': tag_key,
                                'PropagateAtLaunch': True,
                                'ResourceId': asg,
                                'ResourceType': 'auto-scaling-group',
                                'Value': tag_value,
                            }
                        ],
                    )
                    data["Reason"] = "The ASG " + asg + " has been tagged."
                    cfnresponse.send(event, context, cfnresponse.SUCCESS, data)
                elif event['RequestType'] == 'Delete':
                    data["Reason"] = "Resource deleted"
                    cfnresponse.send(event, context, cfnresponse.SUCCESS, data)
                else:
                    data["Reason"] = "Unknown operation: " + event['RequestType']
                    cfnresponse.send(event, context, cfnresponse.FAILED, data, "")
            except Exception as e:
                data["Reason"] = "Cannot " + event['RequestType'] + " Resource: " + str(e)
                cfnresponse.send(event, context, cfnresponse.FAILED, data, "")
I hope this can help.
Anything missing mandatory tags is considered non-compliant in my organization. EKS ASGs get deleted when the compliance scan kicks in. We really need to have the tags propagated from the managed node group to these ASGs.
Same here: using the launch template to propagate tags to EC2 instances is not enough. We also need to tag the ASG itself to be compliant with our organization's policy, otherwise it will be scaled down to 0.
Why is this issue controversial? Many companies need tags on ASGs for cost and compliance reasons.
I'm not sure what you're seeing as controversial in this ticket?
- The originally-described use-case for this feature has been resolved elsewhere, per #608 (comment)
- ASG tags for use by cluster autoscaler scale-to-zero is being resolved in #724 by adding support to CA to not depend on the ASG tags
- Other uses of ASG tags have been described in this ticket, as requested in #608 (comment)
I don't see anyone saying that this should not happen, or otherwise introducing controversy?
EKS ver == 1.20
As a workaround, I have to launch a node group, create a custom launch template with resource tags based on the node group's template, delete the existing node group, and then re-create the node group with the customized template to apply the resource tags.
Why this issue is controversial I do not know.
AWS has always had Auto Scaling Groups, and has always had resource tags. We just want to take advantage of them. AWS doesn't need to develop anything additional.
Why can't managed services take advantage of these features?
Am I making a strange request?
The Auto Scaling Group names automatically generated by managed node groups are indistinguishable to humans. Without resource tags, you won't be able to comfortably tell them apart.
Additional contexts:
#608 (comment)
Trying to manage SPOT/OD node groups as managed node groups.
To be able to scale from zero in such a scenario I need to tag the ASGs according to this doc.
Without the option to add custom tags, I'm unable to make this work with managed node groups.
@TBBle this is related to scale-from-zero with Cluster Autoscaler (#724), but according to the docs it is needed more urgently to have Cluster Autoscaler work correctly with labelled and tainted nodes.
In my mind there are three ways to solve this.
- Support copying all managed node group tags to the ASG
- Support copying all tags with a specific prefix or prefixes to the ASG
- Automate creating the k8s.io/cluster-autoscaler/node-template/label/ and k8s.io/cluster-autoscaler/node-template/taint/ tags on the ASG
Option 1 would match the status quo for un-managed node groups, option 2 would limit the scope of the tags, and option 3 would actually make managed node groups a better solution than their un-managed counterparts.
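Until something like option 3 exists, the node-template tags can be added by hand or by a small script against the ASG behind the node group. A sketch of that with boto3, where the cluster/node group names and the label/taint values are placeholders you would replace with your own:

import boto3

CLUSTER = "my-cluster"   # placeholder: your cluster name
NODEGROUP = "gpu"        # placeholder: your node group name

eks = boto3.client("eks")
autoscaling = boto3.client("autoscaling")

asgs = eks.describe_nodegroup(clusterName=CLUSTER, nodegroupName=NODEGROUP)[
    "nodegroup"]["resources"]["autoScalingGroups"]

# Cluster Autoscaler reads these at scale-from-zero to build a node template.
template_tags = {
    "k8s.io/cluster-autoscaler/node-template/label/workload": "gpu",
    "k8s.io/cluster-autoscaler/node-template/taint/dedicated": "gpu:NoSchedule",
}

for asg in asgs:
    autoscaling.create_or_update_tags(
        Tags=[
            {
                "ResourceId": asg["name"],
                "ResourceType": "auto-scaling-group",
                "Key": key,
                "Value": value,
                # Propagation isn't required; CA only reads the tag off the ASG.
                "PropagateAtLaunch": False,
            }
            for key, value in template_tags.items()
        ]
    )

Note that anything applied this way lives outside the managed node group definition, so it can be lost when the node group (and its ASG) is replaced.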
The long-term solution chosen by AWS is none of those (instead, Cluster Autoscaler reads Managed Nodegroups metadata directly to learn the labels and taints), but in #724 you'll see examples and workarounds implementing your approaches, and that would be the place to make your case that scale-from-zero can't wait for the implementation of the CA feature, but should be handled by some kind of ASG tag automation as you have described.
@otterley While Managed Nodegroup doesn't support customer provided tags for ASGs today, we do add the necessary tags for CAS auto discovery to the ASG, i.e. k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<CLUSTER NAME>.
@rtripat I wouldn't say that all necessary tags are added…
I'm trying to use the autoscaler in an architecture-mixed EKS cluster (ARM + x86) and it just doesn't work, because I have a GitLab Runner running on an ARM node which spins up x86 Pods. The autoscaler is totally unaware of the nodeSelector kubernetes.io/arch=amd64 I set for GitLab Runner, and it can't scale the 0-node x86 node groups up from zero.
I followed the docs and added the tags in Terraform (k8s.io/cluster-autoscaler/node-template/label/kubernetes.io/arch=amd64 to be specific); they were added to the node groups and… well… it doesn't work, because the ASGs didn't get those tags. Adding them to the ASGs manually makes the autoscaler work properly in this scenario.
But handling OS and arch should be fully automatic. C'mon, EKS management costs A LOT and it's unable to export basic Kubernetes labels… :/
As noted in #724, hashicorp/terraform-provider-aws#20674 is pending release which will allow you to add the tags to the ASG that's implicitly created by the managed node group (assuming you're using terraform to create those node pools or are able to otherwise find out the ASG name).
It's a lot more work than if it would happen automatically, but it's at least possible now.
@daenney will EKS terraform module be also updated to fix this? https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/latest
@morsik No idea, I don't use that module and am not one of the maintainers. I'd suggest raising that on their issue tracker: https://github.com/terraform-aws-modules/terraform-aws-eks/issues.
Passing the managed node group tags to the launch template's "Instance tags" will automatically apply them to both the EC2 instances and their volumes. If there are challenges doing that, creating a separate "Custom Tags" section in the EKS managed node group configuration page would also be helpful.
Can you please help identify where the "Custom Tags" section exists?
You haven't mentioned what tool you're using, if any, but at the EC2 API level, that was probably referring to TagSpecifications in the RequestLaunchTemplateData object used with the CreateLaunchTemplate and CreateLaunchTemplateVersion APIs.
That's what's used for terraform-provider-aws's launch template implementation, and eksctl's managed and unmanaged node group implementations.
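If you're calling the EC2 API directly, a minimal sketch of what that looks like with boto3 is below. The template name and tag values are just examples, and the other launch template settings (AMI, instance type, user data, and so on) are omitted for brevity:

import boto3

ec2 = boto3.client("ec2")

# TagSpecifications inside LaunchTemplateData is what tags the resources
# created at launch (instances and their volumes); a top-level
# TagSpecifications argument would only tag the launch template object itself.
ec2.create_launch_template(
    LaunchTemplateName="eks-workers-example",  # example name
    LaunchTemplateData={
        "TagSpecifications": [
            {
                "ResourceType": "instance",
                "Tags": [{"Key": "cost-center", "Value": "1234"}],
            },
            {
                "ResourceType": "volume",
                "Tags": [{"Key": "cost-center", "Value": "1234"}],
            },
        ],
    },
)

The resulting template can then be referenced from the managed node group's launch template setting so the tags end up on the nodes it launches.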
The issue with aws_autoscaling_group_tag is that, in order to be able to loop over the ASGs, they have to exist. So one can't write modules to create node groups and tag the auto scaling groups in one go. It will always require a --target apply due to a Terraform limitation.
The only viable approach is to:
- Have the node group resource propagate the tags
- Have the CA read the tags from node group instead of the auto scaling group
I understand from the thread that option 2 is preferred. But I can't find any PR or issue regarding this in the CA. Is AWS working on this?
Why is this issue controversial? Many companies need tags on ASGs for cost and compliance reasons.
Same here. ASG tagging is mandatory in my company for cost and compliance.
I find it useful to use the launch template configuration for the Terraform EKS module.
This way I'm able to tag the underlying instances. Tagging the instances in my case is more valuable than tagging the ASG itself.
See the TF example here.
I'll say it again: I don't want to monitor EC2, I want to monitor the AutoscalingGroup itself.
#608 (comment)
Is this even still being looked at and worked on? For us ASG tagging is a compliance issue. I know there are workarounds using lambda and such, however our enterprise has a ton of accounts and having this be the default behavior of nodegroups would make our lives so much simpler. Doesn't seem like a big ask.
It's almost two years for just setting a simple tag... which turns out to be critical for compliance, security, scaling node groups...
I don't think it's even been "worked on", it's still in the 'Researching' stage, and probably lost a lot of momentum when it was thought that the value was only for cluster-autoscaler, or for tags on ASGs only to propagate to nodes, both of which are being addressed elsewhere. Or at least that's my impression from #608 (comment).
Disappointingly, since @mikestef9 asked in that comment a year ago for more ASG-tagging use-cases, we haven't heard any follow-up (or even acknowledgement) on the various use-cases that have been shared here.
@TBBle there are a lot of ASG tagging use cases in this thread before the @mikestef9 comment that launch template tagging won't address, and there are also a lot of ASG tagging use cases called out in #724 when discussing the tags for CA.
Looking back, the only comment I can see that requested something other than node-tagging via tag propagation, or Cluster Autoscaler tags for scale-from-zero, was #608 (comment) 10 days before @mikestef9's comment, and well after the development work that delivered Managed Node Group Launch Template support would have happened to address the needs called out in this ticket.
A quick skim doesn't show anything in #724 up to that date that isn't specifically about CA scale-from-zero, and if there is such a thing, overlooking it in the context of the rest of the discussion would be pretty easy.
Even the original feature request of this ticket was tagging ASGs for propagation to nodes and volumes.
@TBBle I was sure I'd read a number of requests about compliance policies requiring ASGs to be tagged, but both of these issue threads are so long that it's hard to parse the data out of them; that would be the use-case I'd be putting forward.
I'm still of the opinion that something as trivial as cascading tags from MNGs to ASGs should have been implemented as part of the MVP. If there is a better way to handle the same use cases it needs to actually exist and then it can be adopted by virtue and not as the only solution (or no solution).
The threads are long, but the vast majority of activity was after the comment from @mikestef9 that we were talking about, in August 2020. It's about 20 comments in, compared to the fortyish that have come after that. And #724 spent most of its time before August 2020 talking about actually setting minSize to 0 (and directly modifying the ASG to work around this); the tagging discussion only came up there once, in July 2020, until people were referred to there from here by that comment.
Anyway, looking at "should have been" isn't very valuable here. This is containers-roadmap, not containers-if-we-could-turn-back-time. What's interesting is what is going to be done; currently, that's an empty space, and the rest of the discussion just makes it more likely that the actual use-cases will be lost in the noise and nothing will continue to happen. -_-
(If I could turn back time on this ticket, I'd have opened a different ticket for tagging ASGs for their own sake back when I subscribed to this ticket; at the time I didn't realise that the node-group-tag-propagation part of the request was going to be the need that was fulfilled. Similarly I regret not leaving a comment for my at-the-time use-case way back then too)
@TBBle @stevehipwell I created a separate issue to track the request for ASG tagging (for compliance and cost reasons). This is quite important for many organizations to adopt EKS.
The EBS CSI controller needs the topology.ebs.csi.aws.com/zone tag. The PVs it creates use nodeAffinity and expect the nodes to have topology.ebs.csi.aws.com/zone, so when the ASGs have 0 instances, Cluster Autoscaler needs the ASGs to carry those tags.
Node groups manage ASGs; node groups should be able to tag ASGs.
aws-node-termination-handler Queue mode needs ASGs to be tagged with Key=aws-node-termination-handler/managed. The ability to propagate tags from the MNG to the ASG would make this easier for users.
@nkk1 There's a proposal/discussion for Cluster Autoscaler to assume/populate topology.ebs.csi.aws.com/zone on the scale-from-zero template using the AWS backend. Since the label is auto-added by the CSI controller, I think that'll be less surprising for the user, and frankly it will probably land sooner than ASG user-defined tagging support.
If CA doesn't accept that proposal, then user ASG tagging will be the only way to make scale-from-zero work correctly with Managed Node Groups and the EBS CSI controller, since the effort to migrate away from that label seems to have failed.
@askulkarni2 I think the guidance is to not use NTH with MNGs, @bwagner5 can probably shed a bit more light here.
@stevehipwell @askulkarni2 That is correct. NTH is not needed when using managed node groups for handling terminations. MNG already gracefully handle terminations due to ASG scale-down, spot interruptions, and capacity rebalance.
@bwagner5 fantastic! Thanks for the insight. And thanks @stevehipwell for bringing it to attention.
This is really crucial for several reasons; the following are some of the issues we're experiencing due to this limitation.
- Cost analysis issues due to missing tags on the ASG and its EC2 instances
- Running dedicated node groups for specific workloads requires a couple of cluster-autoscaler tags for properly scaling node groups based on labels and taints.
So I really hope there will be a proper solution instead of just workarounds.
This is a really important feature!
Let's say we need to tag the ASG to make Cluster Autoscaler work like a charm, track cost, resource ownership, ...
Using AWS managed node groups:
- we can set labels and taints but not the ASG tags; in Terraform this is easily solved using something like this:
resource "aws_autoscaling_group_tag" "tag_cpu_ng" {
autoscaling_group_name = aws_eks_node_group.cpu_ng.resources[0].autoscaling_groups[0].name
tag {
key = "k8s.io/cluster-autoscaler/node-template/taint/X"
value = "NoSchedule"
propagate_at_launch = true
}
}
And using CloudFormation?
Using self-managed node groups:
- we can create the ASG and set the tags, but how can we set the taints and labels? Using the bootstrap.sh?
Managed ASG tagging is now implemented in eksctl with eksctl-io/eksctl#5002 and should land in the next release.
I am waiting for this feature to be natively supported by AWS.
While eksctl is an excellent solution, it is not suitable for every Infrastructure as Code context, and this needs continued support to be available in solutions such as CloudFormation and Terraform.
@andre-lx's answer is spot on. Using the "aws_autoscaling_group_tag" resource worked for me, but it only worked for new nodes, so I just cycled out my existing nodes one by one and the new nodes were all tagged as they should be. For instance, this is my setup for creating a node group and creating an aws_autoscaling_group_tag that sets the "Name" tag, which shows up in EC2.
resource "aws_eks_node_group" "nodes_group" {
cluster_name = aws_eks_cluster.eks_cluster.name
node_role_arn = aws_iam_role.eks_assume_role.arn
subnet_ids = var.subnet_ids
###########
# Optional
ami_type = "AL2_x86_64"
disk_size = 60
instance_types = ["m6i.xlarge"]
node_group_name = "worker"
version = var.kubenetes_version
scaling_config {
desired_size = 2
max_size = 4
min_size = 1
}
update_config {
max_unavailable = 2
}
# Ensure that IAM Role permissions are created before and deleted after EKS Node Group handling.
# Otherwise, EKS will not be able to properly delete EC2 Instances and Elastic Network Interfaces.
depends_on = [
aws_iam_role_policy_attachment.EKS-AmazonEKSWorkerNodePolicy,
aws_iam_role_policy_attachment.EKS-AmazonEKS_CNI_Policy,
aws_iam_role_policy_attachment.EKS-AmazonEC2ContainerRegistryReadOnly,
]
}
#EKS can't directly set the "Name" tag, so we use the autoscaling_group_tag resource.
resource "aws_autoscaling_group_tag" "nodes_group" {
for_each = toset(
[for asg in flatten(
[for resources in aws_eks_node_group.nodes_group.resources : resources.autoscaling_groups]
) : asg.name]
)
autoscaling_group_name = each.value
tag {
key = "Name"
value = "eks_node_group"
propagate_at_launch = true
}
}
In all honesty, Terraform should at least explicitly state in the documentation that the tags argument doesn't work for setting the "Name" tag, as that is a key tag that lots of companies use to organize instances and manage billing. Personally I think not having the tags parameter override the "Name" tag is a bug, but I'd at least settle for better documentation that describes this workaround.
Can you at least update the documentation so other people don't have to waste as much time on this?
Pretty please?
Guys, the resource responsible for EC2 tags is the "Resource tags" section of the launch template. I have the same problem, and I have verified that it's not possible to add tags to the launch template automatically generated by Terraform. To create the tags for this resource, we need to provision our own launch template.
However, I managed to at least tag EC2 with the "default tags" from my repository. Here is the code used:
data "aws_default_tags" "default_tags" {}
# Add 1 Tag for the 1 or more node groups
resource "aws_autoscaling_group_tag" "tag_aws_node_termination_handler" {
for_each = toset([
aws_eks_node_group.node_group_name1.resources[0].autoscaling_groups[0].name,
aws_eks_node_group.node_group_name2.resources[0].autoscaling_groups[0].name
])
autoscaling_group_name = each.value
tag {
key = "aws-node-termination-handler/managed"
value = "PropagateAtLaunch=true"
propagate_at_launch = true
}
}
# Add tags for 1 node group
resource "aws_autoscaling_group_tag" "default_tags_asg_high" {
count = length(keys(data.aws_default_tags.default_tags.tags))
autoscaling_group_name = aws_eks_node_group.node_group_name.resources[0].autoscaling_groups[0].name
tag {
key = keys(data.aws_default_tags.default_tags.tags)[count.index]
value = values(data.aws_default_tags.default_tags.tags)[count.index]
propagate_at_launch = true
}
}
I just came here to say this issue has been open for 900 days and is the 3rd most 👍'd in the project.
Just found this, again, in conversation with our AWS AM. Kind of a dupe of #724.
As @SlevinWasAlreadyTaken mentions, this is now available in eksctl thanks to his hard work in eksctl-io/eksctl#5002.
Some bash to work around it is in #724 (comment).
There is some Terraform in that comment chain too.
Edit: I don't work for AWS, complain to your account manager. I wholeheartedly agree this is ridiculous.
Hey guys.
Here at YData we built a solution to solve this issue until the official release. We created an AWS Lambda that can tag the EKS node groups' ASGs with common tags or specific tags per node group, using CloudFormation.
This can be used to add tags for labels and taints (cluster-autoscaler) or any other tags that help with cost tracking, resource ownership, and so on.
Feel free to test it and give us your review.
https://github.com/ydataai/aws-asg-tags-lambda
You can use it directly in the template as:
ASGTagLambdaFunction:
  Type: AWS::Lambda::Function
  Properties:
    Role: !GetAtt ASGTagLambdaExecutionRole.Arn
    PackageType: Image
    Code:
      ImageUri: !Ref EcrImageUri
    Architectures:
      - x86_64
    MemorySize: 1024
    Timeout: 300

ASGTagLambdaInvoke:
  Type: AWS::CloudFormation::CustomResource
  DependsOn: ASGTagLambdaFunction
  Version: "1.0"
  Properties:
    ServiceToken: !GetAtt ASGTagLambdaFunction.Arn
    StackID: !Ref AWS::StackId
    AccountID: !Ref AWS::AccountId
    Region: !Ref AWS::Region
    ClusterName: "the EKS cluster name" # !Ref EKSCluster
    CommonTags:
      - Name: "ENVIRONMENT"
        Value: "dev"
        PropagateAtLaunch: true
    NodePools:
      - Name: "system-nodepool" # !GetAtt YourNodeGroup.NodegroupName
        Tags:
          - Name: 'k8s.io/cluster-autoscaler/node-template/taint/TAINT'
            Value: 'NoSchedule'
            PropagateAtLaunch: true
          - Name: 'k8s.io/cluster-autoscaler/node-template/label/LABEL'
            Value: 'LABEL_VALUE'
            PropagateAtLaunch: true
      - Name: "another-pool"
Here at YData we built a solution to solve this issue till the official release.
Each of us already has such a solution.
We do not want to do that in the future, so we are requesting official support for such a solution.
AWS has always had Auto Scaling Groups, and has always had resource tags. We just want to take advantage of them. AWS doesn't need to develop anything additional.
Why can't managed services take advantage of these features?
Am I making a strange request?
I wanted to share a recent launch from the EKS team that might be of interest to folks following this issue. Earlier this week we released Cluster-level Cost Allocation Tagging:
With this launch, all EC2 instances which join an EKS cluster are automatically tagged with an AWS-generated cost allocation tag [containing the EKS cluster name]. Any EC2 instance used in an EKS cluster will be tagged automatically without any additional action required, regardless of whether they are provisioned using EKS managed node groups, Karpenter, or directly via EC2. This tag can be used to allocate EC2 costs to individual EKS clusters through AWS Billing and Cost Management tools...
While this feature won't help to propagate customer-defined tags down to the EC2 instances in an EKS cluster, for those of you who are looking for better cost allocation across multiple EKS clusters, this feature will reduce the work required.
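For those wanting to consume that automatic tag programmatically, below is a sketch of pulling per-cluster EC2 cost with the Cost Explorer API. It assumes the cost allocation tag has been activated in the Billing console and that the tag key is aws:eks:cluster-name as per the announcement; the date range is just an example:

import boto3

ce = boto3.client("ce")  # Cost Explorer

# The tag only appears in Cost Explorer data after it has been activated
# as a cost allocation tag in the Billing console.
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2023-01-01", "End": "2023-02-01"},  # example range
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "aws:eks:cluster-name"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    cluster = group["Keys"][0]
    cost = group["Metrics"]["UnblendedCost"]["Amount"]
    print(cluster, cost)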
While this feature won't help to propagate customer-defined tags down to the EC2 instances in an EKS cluster, for those of you who are looking for better cost allocation across multiple EKS clusters, this feature will reduce the work required.
Thanks for sharing this update, I am sure some will find it useful. I don't want to shoot the messenger, and I know you are just trying to help, but this really is attacking the problem from the wrong end, is it not? 🤷‍♂️
EC2 instances that are not a part of a managed node group are already easily taggable with customer-defined tags and managed node groups already had this tag that would be passed through for cost-allocation. From my perspective this change does little to reduce work required if you want to use managed node groups.
I only know this because of having to write code for a previous employer to process this tag to retrieve the customer defined tag values so costs could be allocated in the same way as everything else. Luckily we had well structured cluster names that included the values required. However, it is brittle and the processing broke a few times when cluster name structure was changed for operational reasons (eg to add blue/green support to our automation for EKS upgrades).
Please consider following through on this issue (getting close to 3 years now). We always get told by our account managers, TAMs and SAs to use tags, it would be nice if tagging actually worked for cost allocation for all resources, EKS or otherwise. Thanks.
Can someone please help us understand why this is not getting any traction with this much attention? It appears to still be in the "researching" phase. I just ran into this when trying to scale from 0 on managed node groups using the Terraform EKS module.
Honestly you should all move to Karpenter.
Another vote for karpenter
Honestly you should all move to Karpenter.
Honestly, that has to be one of the least helpful suggestions I have seen in a while. We are talking about managed node groups, which are not going to magically get tagged if you install Karpenter in the cluster and start launching nodes (you also need running nodes for Karpenter's controller to run on, so it's a bit of a chicken-and-egg problem). Karpenter is a nice tool for sure, but it is not a solution to this issue.
you also need running nodes for Karpenter's controller to run on
Customers can run Karpenter on Fargate (managed compute). This helps eliminate the bootstrapping problem. However, resource tagging is not yet available for Fargate on EKS. If Karpenter is the only thing running on Fargate, this might be acceptable for cost-allocation purposes.
Karpenter has some nice features but is a bit more complex than most people probably need, since it is geared for clusters with larger workloads that can benefit from more advanced scheduling and continuous resource optimization. One thing I don't like is Karpenter's requirement to know specific node group info vs just using static tags with the cluster name. Thanks for the suggestion, but for those that want to continue to use CA it would be nice to see some tags on the Auto Scaling Groups to solve this.
Going off topic here, but curious about this
One thing I don't like is Karpenter's requirement to know specific node group info vs just using static tags with the cluster name.
What do you mean?
Karpenter doesn't care about anything other than pending pods to be allocated.
Or you can go overboard and create very scoped provisioners per team or deployment label
I might be wrong, but I was just looking at their install documentation and noticed they wanted "--set aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME}" in the Helm chart; I'm guessing there may be a way to just use IRSA instead. I also see there is an application of a manifest after you run the Helm chart, which is a bit odd, but these are just installation-related issues, and off topic as you said.
@flowinh2o I think aws.defaultInstanceProfile is the instance profile for the nodes being created, so they can join the cluster.
Running dedicated node groups for specific workloads requires a couple of cluster-autoscaler tags for properly scaling node groups based on labels and taints.
Exactly our problem also. Details here.
This looks like such a must-have feature that's super simple to implement if you just allow node group tags to propagate to the ASG...
Any updates on this? Also needed here
Also having a problem with this. I wouldn't expect, in a service called Managed Node Groups, that I would have to work around tag propagation issues. It seems vendors such as Weaveworks have implemented their own workarounds; sadly there is no such workaround in Terraform.
This issue is pretty fundamental - would like it fixed please :)
It seems vendors such as Weaveworks have implemented their own workarounds; sadly there is no such workaround in Terraform.
@RogerWatkins-Anaplan it's easy enough to do this with Terraform, and I think there are links in some of the comments above on how to do it. That said, you wouldn't expect to need to do this for a first-party vendor solution.
Is anyone working on this? This is a must-have for us to implement our project.