Tag list API should include pagination
syed opened this issue · 12 comments
The current [https://github.com/opencontainers/distribution-spec/blob/main/spec.md#listing-tags](tag list API spec) does not have a limit on the response from the registry side. This exposes the registry to a potential DDoS when a client requests tags for a repo that has a lot of tags.
Right now all major registries have pagination and wont return the full tag list. From the discussion on the OCI call, it seems like there was language around pagination which was accidentally deleted. Can we please add the pagination back into the tag list spec
As you've alluded to, this is one of the things that was part of the spec, but the language was (accidentally?) removed (as noted in #443 🙈) -- looking at the history and adding back some appropriate pagination references would probably be appreciated (although to be clear, I am not a maintainer of the distribution spec, only image/runtime 😇).
The text we missed appears to be:
The implementation MAY impose a maximum limit and return a partial set with pagination links.
Unfortunately, with the 1.0 release, we have the following text:
<name>
is the namespace of the repository, and<int>
is an integer specifying the number of tags requested. The response to such a request MAY return fewer than<int>
results, but only when the total number of tags attached to the repository is less than<int>
. Otherwise, the response MUST include<int>
results.
I don't think there's an easy way to combine the "MUST include <int>
results" with "MAY impose a maximum limit and return a partial set" without breaking clients. Allowing the server to ignore the count and return a previously undocumented link header will result in clients that believe they have reached the end of the list prematurely (the returned count is less than the requested limit).
If we are going to break clients, I'd prefer to do that in a way that explicitly fails, with an error code response and a 400 status code, to a request that exceeds a limit on the server. Similar to the maximum manifest size, we could document a limit that should be supported by registries, and that clients should not exceed, for interoperability. Registries could elect to support more than that, particularly for legacy clients that do not specify a count on a large repo. And if a link header is seen, clients can either use that or use the legacy behavior of creating the next request using the <last>
parameter.
Does this seem like a feasible path forward to others, given the text of the 1.0 spec?
in the spec, the next sentence reads
When n is zero, this endpoint MUST return an empty list, and MUST NOT include a Link header.
it looks like at some point there was a Link
header. Does this mean when n !=0
a Link header MAY be included?
We use the Link
header with last
param because it has less chance of being abused. An offset based pagination relies on using OFFSET
in a database query which has worse performance as the offset number increases because the DB has to pretty much read all the rows to get to that offset. Admittedly, this is implementation dependent but most registries use some DB in their backend.
Does this mean when n !=0 a Link header MAY be included?
It may be included since there's nothing saying it may not be included, but there's no requirement for clients to use it.
Also hitting various issues with this. You basically can't write a compliant implementation of "listing tags" that works in practice: every registry implements pagination differently now.
Can this issue please be fixed? Related: #116.
Agreed. I'd propose that Link
usage should be encouraged but not mandatory (to not break older clients) but n
should not be unbound. A registry should be allowed to respond with an error if n
is too high and only return less than n
if the client paginated to the last result page.
only return less than n if the client paginated to the last result page.
Uh, that would be really bad: clients want to maximize page size, servers want to minimize.
Is a client going to bisect the server for the largest supported n
?
I think you're saying this because you want to rely on "less than n results" signals "end of data", which is necessary cause old clients do not hint (with Link
headers) there is more data to fetch, correct?
The much more obvious solution is to let the server return at most n
results, and put the piece of information that "more tags are available" in the JSON response. That way you can distinguish between OCI v1.1 registries that did not respond with Link
headers and OCI v1.2. It also follows what folks have done in practice when implementing registries, and it was suggested a long time ago. If you send next-token: false / string
it also solves the unnecessary second request in the ambiguous but unlikely case of having exactly n
results on the server and n
requested.
I would recommend against Link
headers. Sure, they are standard. But it's incredibly likely that it will be implemented in a non-conforming way to RFC 5988. Notice how you're supposed to respond with Link: < IRI >
where IRI is a special encoding of a URL. I've already seen a case where the <
and >
bits were omitted. Given that the OCI spec already violates the RFC 7233 with regard to Content-Range
I expect that the same will happen with Link: ...
.
For clients it's also much easier to parse a bit of JSON. RFC-conforming header value parsers are rare, and made extra difficult since servers may not always exactly conform to it.
Sounds reasonable. The main reason why I am proposing not to have the client be able to dictate n
is that this can be a very expensive operation for a large-scale registry service. Imagine a client wants to have a page size of 1000 - this could be used as an attack vector to rapidly request expensive tag lists of crowded repos, that are hard to counter with traditional API rate limiting, because the cost of each additional API call varies greatly.
The much more obvious solution is to let the server return at most
n
results, and put the piece of information that "more tags are available" in the JSON response.
That would only work for new clients going forward but break existing clients that are following older versions of the spec in a non-explicit way (giving a false indication that all tags have been pulled). One of the challenges being addressed is there are a lot of existing clients, including written with a simple curl command, that are not aware they are receiving a partial list when they request all tags, and those that implement pagination that depend of the registry returning less than a full list to indicate the last page of results.
#470 opts for an explicit failure when clients exceed the limit so the need to update an implementation to work with the registry is clear. And it provides a header on the failure to allow clients to gracefully downgrade to the maximum allowable number of tags.
If registries support returning more tags than requested, I think using the link header would be an acceptable way for registries to return that in the later responses (first response would be 1-100 tags, and the link header would point to 101-1100). Support for that link header is also optional, and I'd expect a spec conforming client that sees an invalid link header to ignore it and default to the pagination args in the URL instead.
I would not consider it very breaking.
- Any client that conforms strictly to the spec and assumes all tags are listed already fails on various registries that implement pagination due to an oversight in the spec. Follow the spec and you have a broken client in practice.
- If you change the spec to say that registries may limit results and must support pagination, then registries can choose to continue to return all results if they care about not breaking old clients.
You could even see it as a security bug embedded in the current spec (no pagination may result in denial of service), so fixing that problem is non-breaking.
It makes absolutely no sense to let clients guess how many tags they can fetch from an arbitrary registry.
It makes absolutely no sense to let clients guess how many tags they can fetch from an arbitrary registry.
#470 removes all the guessing, clients are told how many tags a registry should support, and the registry tells the client how many it does support if the client requests too many.