performance: what can dist-spec do to improve downloads of large images/layers?
rchincha opened this issue · 7 comments
Things we already allow/do:
- Parallel download of layers
Things we can likely improve:
- For large layers, range-based downloads - fetch sections of a large file using the HTTP `Range` header and stitch them back together?
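The range-and-stitch idea above can be sketched roughly as follows. This is a hypothetical illustration, not registry client code: `plan_ranges`, `download_blob`, and the injected `fetch_range` callable are all made-up names, and the 4 MiB chunk size is an arbitrary assumption.

```python
# Sketch: split a blob into fixed-size chunks, fetch each chunk with a
# ranged GET, and stitch the parts back together in order.
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4 * 1024 * 1024  # assumed 4 MiB per range request

def plan_ranges(total_size, chunk_size=CHUNK_SIZE):
    """Return the inclusive (start, end) byte ranges covering the blob."""
    return [(start, min(start + chunk_size, total_size) - 1)
            for start in range(0, total_size, chunk_size)]

def download_blob(total_size, fetch_range, workers=8):
    """fetch_range(start, end) performs one ranged GET and returns bytes;
    a real client would send 'Range: bytes=<start>-<end>' and expect 206.
    ThreadPoolExecutor.map preserves input order, so the parts join in
    the correct sequence even though they download concurrently."""
    ranges = plan_ranges(total_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda r: fetch_range(*r), ranges))
    return b"".join(parts)  # stitch the chunks back together
```

A real client would also verify the reassembled bytes against the blob digest before use, since a partial-content bug would otherwise corrupt the layer silently.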
Things known to the community:
- https://github.com/containerd/stargz-snapshotter/blob/main/docs/estargz.md
^ FUSE-based filesystem solutions - lazily fetch individual files only when they are referenced
For streaming/lazy loading:
Multipart layer downloads with range requests:
- containerd/containerd#10177
- https://github.com/awslabs/amazon-ecr-containerd-resolver#parallel-downloads
Direct mounts of compressed tars (saves on extraction time).
What changes are needed in distribution-spec to support this? Is a pointer to the HTTP specs documenting range requests enough?
> What changes are needed in distribution-spec to support this? Is a pointer to the HTTP specs documenting range requests enough?
IMO, the question is which server- and client-side optimizations can be demonstrably enabled by dist-spec changes.
Just like conformance, we should write benchmark code for this.
> Just like conformance, we should write benchmark code for this.
Is that an OCI requirement, or something implementations should be doing?
> Just like conformance, we should write benchmark code for this.

> Is that an OCI requirement, or something implementations should be doing?
Not an OCI requirement - our conformance suite should already ensure conformant registries can handle range-based blob pulls.
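A conformance-style check for this could be as simple as inspecting the response to a ranged blob GET. The sketch below is hypothetical (the helper name `honors_range` is invented), but the semantics it checks come from HTTP itself: a server honoring `Range: bytes=<start>-<end>` replies `206 Partial Content` with a matching `Content-Range`, while `200 OK` means the `Range` header was ignored.

```python
# Sketch: decide from a ranged blob GET's status and headers whether the
# registry honored the Range request.
def honors_range(status, headers, start, end):
    """True iff the response is 206 with a Content-Range matching the
    requested byte span, e.g. 'bytes 0-1023/146217'."""
    if status != 206:
        return False  # 200 means the server sent the whole blob instead
    content_range = headers.get("Content-Range", "")
    return content_range.startswith(f"bytes {start}-{end}/")
```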
#537 (comment)
^ It will be interesting to see data from various registries. If viable, clients should move to this model, choosing it based on blob size.