bagetter/BaGetter

Cache storage size & retention

Opened this issue · 5 comments

When using BaGetter as a mirror service, it would be nice to have some control over how much data is stored, and perhaps a retention/eviction policy.

What do you mean exactly? How much data BaGetter is allowed to download from the upstream? Or how much data gets downloaded from BaGetter?

I could imagine an optional setting for the read-through cache, such as "RemoveUnusedAfterDays", that, when set, checks when a cached package was last requested and removes it from the cache once the specified time has passed.
+ stable/prerelease differentiation?
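
To make the idea concrete, here is a rough, language-agnostic sketch in Python (not BaGetter code). It assumes each cached package records when it was last requested; the `CachedPackage` record and the separate `prerelease_after_days` value are hypothetical, only "RemoveUnusedAfterDays" comes from the suggestion above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CachedPackage:
    # Hypothetical record; the real package entity may look different.
    package_id: str
    version: str
    is_prerelease: bool
    last_requested: datetime


def select_expired(
    packages: List[CachedPackage],
    now: datetime,
    remove_unused_after_days: Optional[int],
    prerelease_after_days: Optional[int] = None,
) -> List[CachedPackage]:
    """Return cached packages that were not requested within the window.

    A value of None means the setting is not set, so nothing expires.
    Prerelease versions use their own window when one is given.
    """
    expired = []
    for pkg in packages:
        days = remove_unused_after_days
        if pkg.is_prerelease and prerelease_after_days is not None:
            days = prerelease_after_days
        if days is None:
            continue
        if now - pkg.last_requested > timedelta(days=days):
            expired.append(pkg)
    return expired
```

A background job could call something like this periodically and delete whatever it returns. Updating `last_requested` on every download is the part that would need support in the storage layer.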

Control over how much data is used might be difficult. For starters, there is currently no way of seeing how much data is actually being used, unless we roll it ourselves and basically just load all packages and sum up their sizes.
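
As an illustration of the "roll it ourselves" approach, a minimal Python sketch, assuming the packages live as `.nupkg` files somewhere under a local directory. The function name and the directory layout are assumptions; other storage backends would need their own way of listing objects and sizes.

```python
import os


def total_package_bytes(root: str) -> int:
    """Sum the sizes of all .nupkg files found below `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only count package files, not nuspec/readme/icon files.
            if name.lower().endswith(".nupkg"):
                total += os.path.getsize(os.path.join(dirpath, name))
    return total
```

Walking the whole tree on every check gets expensive for large feeds, so a real implementation would probably keep a running total that is updated on push, cache, and delete.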

Then if we have this information, what to do when we hit a set limit?

  • Delete the oldest (least recently requested) package from the cache, possibly treating stable and prerelease versions differently?
  • Do we ever delete packages from our own repository? This one is risky.
  • Passive reactions: stop read-through caching of more packages, or block new uploads?
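
One way the first and third options could fit together, sketched in Python under assumptions: every stored package has a size, a last-requested time, and a flag saying whether it came from the upstream. `StoredPackage` and `plan_eviction` are hypothetical names. Packages from our own repository are never selected, and the caller is told when evicting cached packages alone is not enough.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass
class StoredPackage:
    # Hypothetical record for illustration only.
    name: str
    size_bytes: int
    last_requested: datetime
    from_upstream: bool  # True = read-through cached, False = pushed by us


def plan_eviction(
    packages: List[StoredPackage], limit_bytes: int
) -> Tuple[List[StoredPackage], bool]:
    """Pick least recently requested cached packages until under the limit.

    Returns (packages to evict, whether the limit is met afterwards).
    Pushed packages are never evicted; if the limit is still exceeded,
    the caller can fall back to a passive reaction.
    """
    used = sum(p.size_bytes for p in packages)
    candidates = sorted(
        (p for p in packages if p.from_upstream),
        key=lambda p: p.last_requested,
    )
    evict = []
    for pkg in candidates:
        if used <= limit_bytes:
            break
        evict.append(pkg)
        used -= pkg.size_bytes
    return evict, used <= limit_bytes
```

If the second return value is `False`, the passive reactions apply: stop caching further packages or reject new uploads until space is freed.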

I don't know the internals, but maybe some system along the lines of:

  • Packages that are cached from nuget.org should be marked somehow (e.g. with the NuGet source address)
  • A setting that controls how many versions of "cached" nuget.org packages will be kept (e.g. max 5)
  • If a new version of a nuget.org package is cached, check whether more than 5 versions are present; if so, evict the oldest.

This would keep "manually" uploaded/pushed packages safe, and would evict old unused versions of cached packages.
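
A rough Python sketch of that scheme, under stated assumptions: the marker is an optional `upstream_source` field (`None` for pushed packages), and "oldest" is taken to mean the version that was cached earliest, which avoids having to compare version strings. `PackageVersion` and `versions_to_evict` are made-up names, not existing BaGetter types.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class PackageVersion:
    # Hypothetical record for illustration only.
    package_id: str
    version: str
    cached_at: datetime
    upstream_source: Optional[str]  # None = manually uploaded/pushed


def versions_to_evict(
    packages: List[PackageVersion], max_cached_versions: int = 5
) -> List[PackageVersion]:
    """Return cached versions beyond the per-package limit, oldest first."""
    by_id: Dict[str, List[PackageVersion]] = defaultdict(list)
    for pkg in packages:
        # Pushed packages carry no source marker and are skipped.
        if pkg.upstream_source is not None:
            # NuGet package IDs are case-insensitive.
            by_id[pkg.package_id.lower()].append(pkg)

    evict = []
    for versions in by_id.values():
        versions.sort(key=lambda p: p.cached_at)
        excess = len(versions) - max_cached_versions
        if excess > 0:
            evict.extend(versions[:excess])
    return evict
```

Running this right after a new version is cached would evict at most one version per package in the steady state. Whether "oldest" should instead mean lowest version number or least recently requested is an open design choice.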

This issue is stale because it has been open for 90 days with no activity. Remove the stale label, comment, or this will be closed in 5 days.
