aws/amazon-vpc-cni-k8s

RefreshSecurityGroups should only be called on ENIs already checked by the ENI/IP reconciler

joshjp-aws opened this issue · 8 comments

What would you like to be added:
I would like the current behavior to change. Currently the ENI/IP reconciler runs on a 60 second loop while the SG reconciler runs on a 30 second loop. Further both reconcilers pull information on the ENIs from IMDS.

Consider a scenario where an instance is launched and shortly after an ENI with the node.k8s.amazonaws.com/no_manage: true tag is added to the instance. If the SG reconciler runs before the ENI/IP reconciler it will modify the security groups as the ENI/IP reconciler hasn't had a chance to check the tags on the ENI yet.

Ideally the SG reconciler would only run against ENIs which are known and confirmed to not have this label attached.

One option here is to have the SG reconciler rely on an internal cache while the ENI/IP reconciler is responsible for pulling the ENIs from IMDS and determine if it is managed/unmanaged.

Why is this needed:
To prevent the SG reconciler from updating ENIs with the node.k8s.amazonaws.com/no_manage: true tag

One option here is to have the SG reconciler rely on an internal cache while the ENI/IP reconciler is responsible for pulling the ENIs from IMDS and determine if it is managed/unmanaged.

This is a good idea. We will evaluate this.

Thanks for looking at this @orsenthil. Will be on standby for the fix as this is affecting production workload in our environment.

Basically we don't need this -

allENIs, err := cache.GetAttachedENIs()
rather pull from datastore unmanaged ENI cache.. might it be tricky to do it and would need some refactoring..

This is addressed.

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.

@orsenthil is there a new release version for this fix? thanks

@sorjii - Not yet. We are planning to release 1.18.2 with this change and other security fixes which are in pipeline. The change is merged in master.

Thanks for the quick response, really appreciate it. :)