distribution/distribution

proxy: Switch from `storage.delete.enabled=false` to `storage.delete.enabled=true` leaks blobs that never get garbage collected

ialidzhikov opened this issue · 2 comments

Description

We provide a service that deploys and manages pull-through caches.
We use the storage.delete.enabled option to control whether "garbage collection of images" is enabled for the pull-through cache.
On our side this setting is mutable: users can toggle GC (under the hood, the storage.delete.enabled option) on and off.
However, there is a case where this switch doesn't behave as expected. When a proxy runs with storage.delete.enabled=false, it still adds entries to the scheduler-state.json file and, once their TTL has passed, tries to remove the corresponding blobs/manifests. With storage.delete.enabled=false the blob does not get removed, but the corresponding entry is removed from scheduler-state.json anyway. Hence, the blob can never be garbage collected, even if the proxy is later restarted with storage.delete.enabled=true.

Reproduce

  1. Run a registry in proxy mode with an upstream and with GC disabled (storage.delete.enabled=false). The following kind of config is used:
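
A minimal config along these lines (an illustrative sketch; the exact config from the report is not shown, and the upstream URL, rootdirectory, and port are assumptions):

version: 0.1
log:
  level: info
storage:
  filesystem:
    # Assumed location; the report does not show the actual rootdirectory.
    rootdirectory: /var/lib/registry
  delete:
    # GC disabled, as in the reproduction.
    enabled: false
http:
  addr: :5000
proxy:
  # Assumed upstream; the reproduction pulls docker.io/library/alpine:3.14.0.
  remoteurl: https://registry-1.docker.io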

  2. Pull the alpine:3.14.0 image from the registry proxy.

# ctr -n k8s.io images pull docker.io/library/alpine:3.14.0

Make sure your scheduler-state.json file looks similar to:

{
    "library/alpine@sha256:1dc785547989b0db1c3cd9949c57574393e69bea98bfe044b0588e24721aa402": {
        "Key": "library/alpine@sha256:1dc785547989b0db1c3cd9949c57574393e69bea98bfe044b0588e24721aa402",
        "ExpiryData": "2024-01-17T10:40:19.747823626Z",
        "EntryType": 0
    },
    "library/alpine@sha256:51b67269f354137895d43f3b3d810bfacd3945438e94dc5ac55fdac340352f48": {
        "Key": "library/alpine@sha256:51b67269f354137895d43f3b3d810bfacd3945438e94dc5ac55fdac340352f48",
        "ExpiryData": "2024-01-17T10:40:18.48521875Z",
        "EntryType": 1
    },
    "library/alpine@sha256:a70bcfbd89c9620d4085f6bc2a3e2eef32e8f3cdf5a90e35a1f95dcbd7f71548": {
        "Key": "library/alpine@sha256:a70bcfbd89c9620d4085f6bc2a3e2eef32e8f3cdf5a90e35a1f95dcbd7f71548",
        "ExpiryData": "2024-01-17T10:40:18.959229542Z",
        "EntryType": 1
    },
    "library/alpine@sha256:c303524923177661067f7eb378c3dd5277088c2676ebd1cd78e68397bb80fdbf": {
        "Key": "library/alpine@sha256:c303524923177661067f7eb378c3dd5277088c2676ebd1cd78e68397bb80fdbf",
        "ExpiryData": "2024-01-17T10:40:20.352161126Z",
        "EntryType": 0
    }
}

  3. Edit the scheduler-state.json file: make one of the blob layers expire by updating its ExpiryData field. Let's do it for the library/alpine@sha256:1dc785547989b0db1c3cd9949c57574393e69bea98bfe044b0588e24721aa402 blob, as shown below.
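
For example, set that entry's ExpiryData to a time that has already passed (the exact value below is illustrative; it matches the log output further down):

    "library/alpine@sha256:1dc785547989b0db1c3cd9949c57574393e69bea98bfe044b0588e24721aa402": {
        "Key": "library/alpine@sha256:1dc785547989b0db1c3cd9949c57574393e69bea98bfe044b0588e24721aa402",
        "ExpiryData": "2024-01-10T10:45:19.747823626Z",
        "EntryType": 0
    }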

Then restart the proxy. Once the expiry time has passed, the scheduler tries to delete the blob:

time="2024-01-10T10:45:19.748260501Z" level=error msg="Scheduler error returned from OnExpire(library/alpine@sha256:1dc785547989b0db1c3cd9949c57574393e69bea98bfe044b0588e24721aa402): operation unsupported" go.version=go1.20.8 instance.id=1fceb0ff-4044-4e41-8119-0584e06cdba4 service=registry version=2.8.3

As expected, the blob is not removed because of the storage.delete.enabled=false option.
However, the corresponding entry does get removed from scheduler-state.json:

{
    "library/alpine@sha256:51b67269f354137895d43f3b3d810bfacd3945438e94dc5ac55fdac340352f48": {
        "Key": "library/alpine@sha256:51b67269f354137895d43f3b3d810bfacd3945438e94dc5ac55fdac340352f48",
        "ExpiryData": "2024-01-17T10:40:18.48521875Z",
        "EntryType": 1
    },
    "library/alpine@sha256:a70bcfbd89c9620d4085f6bc2a3e2eef32e8f3cdf5a90e35a1f95dcbd7f71548": {
        "Key": "library/alpine@sha256:a70bcfbd89c9620d4085f6bc2a3e2eef32e8f3cdf5a90e35a1f95dcbd7f71548",
        "ExpiryData": "2024-01-17T10:40:18.959229542Z",
        "EntryType": 1
    },
    "library/alpine@sha256:c303524923177661067f7eb378c3dd5277088c2676ebd1cd78e68397bb80fdbf": {
        "Key": "library/alpine@sha256:c303524923177661067f7eb378c3dd5277088c2676ebd1cd78e68397bb80fdbf",
        "ExpiryData": "2024-01-17T10:40:20.352161126Z",
        "EntryType": 0
    }
}

At this point the blob leaks on the file system, and the proxy can no longer garbage collect it, even if storage.delete.enabled is set to true in the future.
The corresponding source code (registry/proxy/scheduler/scheduler.go) is:

func (ttles *TTLExpirationScheduler) startTimer(entry *schedulerEntry, ttl time.Duration) *time.Timer {
	return time.AfterFunc(ttl, func() {
		ttles.Lock()
		defer ttles.Unlock()

		var f expiryFunc
		switch entry.EntryType {
		case entryTypeBlob:
			f = ttles.onBlobExpire
		case entryTypeManifest:
			f = ttles.onManifestExpire
		default:
			f = func(reference.Reference) error {
				return fmt.Errorf("scheduler entry type")
			}
		}

		ref, err := reference.Parse(entry.Key)
		if err == nil {
			if err := f(ref); err != nil {
				dcontext.GetLogger(ttles.ctx).Errorf("Scheduler error returned from OnExpire(%s): %s", entry.Key, err)
			}
		} else {
			dcontext.GetLogger(ttles.ctx).Errorf("Error unpacking reference: %s", err)
		}

		// The entry is removed from the scheduler state even when f(ref)
		// failed above (e.g. because deletion is disabled), so the blob is
		// never rescheduled for expiry.
		delete(ttles.entries, entry.Key)
		ttles.indexDirty = true
	})
}

Expected behavior

I would expect the switch from storage.delete.enabled=false to storage.delete.enabled=true to garbage collect all blobs/manifests at time.Now() + ttl, where time.Now() is the moment of the switch to storage.delete.enabled=true (i.e. the enablement of garbage collection).
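
One possible direction for this (just a sketch of the idea, not the project's actual fix; the entry.timer and entry.Expiry usage is my assumption about the scheduler internals) would be to re-arm the timer when the expiry callback fails, instead of dropping the entry:

		ref, err := reference.Parse(entry.Key)
		if err == nil {
			if err := f(ref); err != nil {
				dcontext.GetLogger(ttles.ctx).Errorf("Scheduler error returned from OnExpire(%s): %s", entry.Key, err)
				// Hypothetical change: keep the entry and retry after
				// another ttl, so a later switch to
				// storage.delete.enabled=true can still collect the blob.
				entry.Expiry = time.Now().Add(ttl)
				entry.timer = ttles.startTimer(entry, ttl)
				ttles.indexDirty = true
				return
			}
		} else {
			dcontext.GetLogger(ttles.ctx).Errorf("Error unpacking reference: %s", err)
		}

		delete(ttles.entries, entry.Key)
		ttles.indexDirty = true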

registry version

# registry --version
registry github.com/docker/distribution 2.8.3

Additional Info

No response

I also wanted to discuss the storage.delete.enabled option. I feel that it is not designed for the purpose we use it for. The proxy source code gives the impression that the proxy is designed to always run with GC enabled.
If the proxy has a strict requirement for storage.delete.enabled to always be true, then this could be validated on startup, and the proxy could fail to start if the field is set to false.
On the other hand, I think it is useful to have a setting that controls whether GC is enabled or not.
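
If the strict-requirement route were taken, the startup validation could look roughly like this (a hypothetical sketch; validateProxyConfig does not exist in the codebase, and the field access assumes the layout of the configuration package):

import (
	"fmt"

	"github.com/docker/distribution/configuration"
)

// validateProxyConfig is a hypothetical startup check: refuse to run in
// proxy mode when deletion is disabled, since scheduled expirations could
// then never actually remove anything.
func validateProxyConfig(config *configuration.Configuration) error {
	proxyEnabled := config.Proxy.RemoteURL != ""
	deleteEnabled, _ := config.Storage["delete"]["enabled"].(bool)
	if proxyEnabled && !deleteEnabled {
		return fmt.Errorf("proxy mode requires storage.delete.enabled=true, otherwise expired blobs can never be garbage collected")
	}
	return nil
}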