imduffy15/k8s-gke-service-account-assigner

Cached certificate becomes invalid if service account is recreated

gliush opened this issue · 1 comment

Hi.

The assigner uses ccache to cache service account information:
https://github.com/postmates/k8s-gke-service-account-assigner/blob/master/iam/iam.go#L15

A cached entry can become invalid if the service account is deleted and recreated, or if a new one is created under the same name.

To make matters worse, the Fetch interface of the ccache library may return expired data:

// This can return an expired item. Use item.Expired() to see if the item
// is expired and item.TTL() to see how long until the item expires

This is intentional library behavior, as its readme.md explains:

By returning expired items, CCache lets you decide if you want to serve stale content or not. For example, you might decide to serve up slightly stale content (< 30 seconds old) while re-fetching newer data in the background. You might also decide to serve up infinitely stale content if you're unable to get new data from your source.

So if I store an entry and the cache never holds enough keys for the LRU eviction to kick in, the expired entry is never removed and I keep getting stale data indefinitely.
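To make the failure mode concrete, here is a minimal sketch that uses only the Go standard library. The `store`, `entry`, and `fakeClock` types are hypothetical stand-ins and not ccache's API; `exampleTTL` simply mirrors the 50 min figure mentioned in this issue. It shows a lookup that hands back expired entries next to one that treats them as misses:

```go
package main

import "time"

// exampleTTL mirrors the 50 min lifetime mentioned in this issue (assumption).
const exampleTTL = 50 * time.Minute

// fakeClock is a manually advanced clock, so expiry can be shown without sleeping.
type fakeClock struct{ t time.Time }

func (c *fakeClock) now() time.Time          { return c.t }
func (c *fakeClock) advance(d time.Duration) { c.t = c.t.Add(d) }

// entry is a hypothetical stand-in for a cached item; it is not ccache's type.
type entry struct {
	value     string
	expiresAt time.Time
}

// expired reports whether the entry's TTL has elapsed at time now.
func (e entry) expired(now time.Time) bool { return !now.Before(e.expiresAt) }

// store is a toy cache that never drops expired entries on its own.
type store struct {
	now   func() time.Time // injected clock
	items map[string]entry
}

// set stores value under key with the given TTL.
func (s *store) set(key, value string, ttl time.Duration) {
	s.items[key] = entry{value: value, expiresAt: s.now().Add(ttl)}
}

// get returns whatever is stored, expired or not.
func (s *store) get(key string) (entry, bool) {
	e, ok := s.items[key]
	return e, ok
}

// getFresh treats an expired entry as a miss and deletes it.
func (s *store) getFresh(key string) (string, bool) {
	e, ok := s.items[key]
	if !ok {
		return "", false
	}
	if e.expired(s.now()) {
		delete(s.items, key)
		return "", false
	}
	return e.value, true
}
```

With `get`, an entry stored once is still returned long after `exampleTTL` has passed; with `getFresh`, the same lookup becomes a miss and the caller has to re-fetch.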

The solutions I see:

  1. The simplest: add an interface to invalidate any cached key. If I update a service account, I know what I am doing and can take care of the stale entry myself.
  2. The hardest, since it changes the behavior slightly: remove all expired keys. Every 50 min you would then need to re-fetch all the service account data from Google.
  3. Something in between: reduce the TTL to 5 min and extend an entry that is expired but not yet rotted (for another 5 min). If the entry is "rotted" (stored for 10 min), remove it. That way the data for all existing service accounts is kept, and the data for deleted service accounts is dropped.
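A rough sketch of how options 1 and 3 could look, again with the standard library only. All names here (`accountCache`, `Invalidate`, `Lookup`, `freshFor`, `rotAfter`, `manualClock`) are hypothetical, the `fetch` callback stands in for the call to Google, and the extend-then-rot logic is just one possible reading of option 3:

```go
package main

import "time"

const (
	freshFor = 5 * time.Minute  // option 3: TTL before an entry counts as expired
	rotAfter = 10 * time.Minute // option 3: total age after which an entry is removed
)

// manualClock is a hand-advanced clock, used to exercise expiry without sleeping.
type manualClock struct{ t time.Time }

func (c *manualClock) Now() time.Time          { return c.t }
func (c *manualClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

// cached is one stored service account record (hypothetical shape).
type cached struct {
	value    string
	storedAt time.Time
	expires  time.Time
}

// accountCache is a toy cache keyed by service account name.
type accountCache struct {
	now   func() time.Time
	fetch func(name string) (string, error) // hypothetical loader that asks Google
	items map[string]*cached
}

// Invalidate is option 1: the caller drops an entry it knows to be stale.
func (c *accountCache) Invalidate(name string) { delete(c.items, name) }

// Lookup is one reading of option 3: serve fresh entries, extend expired
// entries that have not rotted yet, and re-fetch rotted or missing ones.
func (c *accountCache) Lookup(name string) (string, error) {
	now := c.now()
	if it, ok := c.items[name]; ok {
		switch {
		case now.Before(it.expires):
			return it.value, nil // still fresh
		case now.Sub(it.storedAt) < rotAfter:
			it.expires = it.storedAt.Add(rotAfter) // expired, not rotted: extend
			return it.value, nil
		default:
			delete(c.items, name) // rotted: remove and fall through to re-fetch
		}
	}
	v, err := c.fetch(name)
	if err != nil {
		return "", err
	}
	c.items[name] = &cached{value: v, storedAt: now, expires: now.Add(freshFor)}
	return v, nil
}
```

In this sketch an entry is served for at most 10 min after it was stored, so a recreated service account would be picked up within that window, and `Invalidate` lets an operator force the refresh immediately.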

I can implement the PR if you share your opinion about this issue and what path you'd choose.

Hi @gliush, I'll leave this one up to you. Feel free to do whatever you think is the best approach.

For the most part, this project should be retired in favor of Google Workload Identity: https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity