GitHub downloader subject to rate limits
alexzielenski opened this issue · 2 comments
For unsupported schemas we have the option of downloading them from GitHub. This requires 1 or 2 API calls to list Kubernetes source directories. GitHub API responses include ETags, so they can be cached and revalidated with conditional requests without affecting the rate limit. We should add an on-disk cache, similar to kubectl's, that stores the ETags.
Another improvement would be to use a GitHub API key if one is supplied through an environment variable or argument. That way the test suite would never be subject to rate limiting.
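A rough sketch of the API key idea, not a final design. The names `resolveToken`, `tokenTransport`, and `observedAuthHeader` are hypothetical, and the `GITHUB_TOKEN` variable name is an assumption; the issue does not say which variable or flag to use. The sketch prefers an explicit argument, falls back to the environment, and attaches the key through an `http.RoundTripper`. It talks to a local test server rather than GitHub.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
)

// resolveToken prefers an explicit argument and falls back to the
// GITHUB_TOKEN environment variable (the variable name is an assumption).
func resolveToken(arg string) string {
	if arg != "" {
		return arg
	}
	return os.Getenv("GITHUB_TOKEN")
}

// tokenTransport (hypothetical) adds an Authorization header when a token is set.
type tokenTransport struct {
	token string
	base  http.RoundTripper
}

func (t *tokenTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if t.token == "" {
		return t.base.RoundTrip(req)
	}
	// A RoundTripper must not mutate the caller's request, so clone it first.
	clone := req.Clone(req.Context())
	clone.Header.Set("Authorization", "Bearer "+t.token)
	return t.base.RoundTrip(clone)
}

// observedAuthHeader sends one request through tokenTransport to a local
// test server and returns the Authorization header the server received.
func observedAuthHeader(token string) (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Header.Get("Authorization")
	}))
	defer srv.Close()

	client := &http.Client{Transport: &tokenTransport{token: token, base: http.DefaultTransport}}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	header, err := observedAuthHeader(resolveToken("example-token"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(header)
}
```

With no token configured the transport passes requests through unchanged, so unauthenticated use keeps working.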
I am interested in taking this on
> We should add an on-disk cache similar to kubectl that stores the etags
To be clear, we want to cache the response data as well, not just the etag? Will take a look at how kubectl manages this as well.
/assign
> To be clear, we want to cache the response data as well, not just the etag? Will take a look at how kubectl manages this as well.
That's correct. For caching discovery and OpenAPI, kubectl uses a caching http.RoundTripper: https://github.com/kubernetes/client-go/blob/master/discovery/cached/disk/round_tripper.go
It may be possible for us to just use a caching RoundTripper directly for our GitHub downloader. kubectl uses the RoundTripper wrapped inside a REST client:
Wraps the REST client config here:
https://github.com/kubernetes/client-go/blob/cf830e3cb3abbcc32cc1b6bea4feb1a9a1881af3/discovery/cached/disk/cached_discovery.go#L301
Creates the caching REST client:
https://github.com/kubernetes/client-go/blob/cf830e3cb3abbcc32cc1b6bea4feb1a9a1881af3/discovery/discovery_client.go#L715
Passes it to OpenAPI here:
https://github.com/kubernetes/client-go/blob/cf830e3cb3abbcc32cc1b6bea4feb1a9a1881af3/discovery/discovery_client.go#L656
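For discussion, here is a minimal sketch of what a caching RoundTripper used directly by the downloader might look like. This is not client-go's implementation, just a stdlib-only stand-in. `etagTransport` and `runDemo` are hypothetical names, and the cache layout (one `.body` and one `.etag` file per URL hash) is an assumption. It stores both the response data and the ETag, revalidates with `If-None-Match`, and serves the cached body on a 304. It ignores request headers such as `Accept` when keying the cache, which a real version would probably need to handle.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"sync/atomic"
)

// etagTransport (hypothetical) caches GET bodies and ETags on disk, keyed by URL.
type etagTransport struct {
	dir  string
	base http.RoundTripper
}

func (t *etagTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if req.Method != http.MethodGet {
		return t.base.RoundTrip(req)
	}
	sum := sha256.Sum256([]byte(req.URL.String()))
	prefix := filepath.Join(t.dir, fmt.Sprintf("%x", sum))
	body, bodyErr := os.ReadFile(prefix + ".body")
	tag, tagErr := os.ReadFile(prefix + ".etag")
	cached := bodyErr == nil && tagErr == nil
	out := req.Clone(req.Context()) // never mutate the caller's request
	if cached {
		out.Header.Set("If-None-Match", string(tag))
	}
	resp, err := t.base.RoundTrip(out)
	if err != nil {
		return nil, err
	}
	if cached && resp.StatusCode == http.StatusNotModified {
		// Unchanged upstream: serve the cached body as a normal 200.
		resp.Body.Close()
		resp.StatusCode, resp.Status = http.StatusOK, "200 OK"
		resp.Body = io.NopCloser(bytes.NewReader(body))
		resp.ContentLength = int64(len(body))
		return resp, nil
	}
	if newTag := resp.Header.Get("ETag"); resp.StatusCode == http.StatusOK && newTag != "" {
		data, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		// Best-effort write; write errors are ignored in this sketch.
		if os.WriteFile(prefix+".body", data, 0o600) == nil {
			_ = os.WriteFile(prefix+".etag", []byte(newTag), 0o600)
		}
		resp.Body = io.NopCloser(bytes.NewReader(data))
	}
	return resp, nil
}

// runDemo fetches the same URL twice from a local test server and reports
// both bodies plus how many full (non-304) responses the server sent.
func runDemo() (first, second string, full int32, err error) {
	dir, err := os.MkdirTemp("", "etag-cache")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("If-None-Match") == `"v1"` {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		atomic.AddInt32(&full, 1)
		w.Header().Set("ETag", `"v1"`)
		fmt.Fprint(w, "schema")
	}))
	defer srv.Close()
	client := &http.Client{Transport: &etagTransport{dir: dir, base: http.DefaultTransport}}
	get := func() (string, error) {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return "", err
		}
		defer resp.Body.Close()
		b, err := io.ReadAll(resp.Body)
		return string(b), err
	}
	if first, err = get(); err != nil {
		return
	}
	second, err = get()
	return
}

func main() {
	fmt.Println(runDemo())
}
```

Because the 304 is rewritten to a 200 inside the transport, the downloader code would not need to know that a cache exists.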