kubernetes-sigs/controller-tools

Generated OpenAPI schema for metav1.LabelSelector accepts invalid input

MikeSpreitzer opened this issue · 11 comments

When I use controller-gen to generate a CustomResourceDefinition from my commented Go definition of an object type that uses metav1.LabelSelector, the OpenAPI schema for the LabelSelector only says that the label names and values must be strings. But the strings that are ultimately allowed in those positions are rather restricted (see https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set).

While I agree with your assessment and that this could be better, I think this is something that, as a community, we probably don't want to fix yet.

Making a change to the way that the schema is generated now would create a breaking change in any consumers API that uses the label selector today and has already shipped.

Until we can fix this, you should be able to use a CEL validation for any label selector you want to add as a new field and then validate the fields within it using a regex (or combination of regexes) that enforces the correct format.

In the future, we may be able to fix this at the controller tools layer. Kube is introducing ratcheting validation and, once that is present and stable, we may be able to find a way to make this a backwards compatible change.

This will need to be done carefully though so that all supported versions of Kube at the time support the ratcheting validation, else we will still run the risk of users generating breaking changes in their APIs.

I think sticking a pin in this for a couple of years might be the best course of action for now.

Agree!

This will need to be done carefully though so that all supported versions of Kube at the time support the ratcheting validation, else we will still run the risk of users generating breaking changes in their APIs.

Wondering if this is enough to be honest. We know that a lot of people are running unsupported versions and I don't know if we want to break them via controller-gen. So maybe we want to provide a flag to opt-out then?

Perhaps a "minimum compatible version" flag of some description that triggers features/generation changes like this.

It would be up to a project to decide their minimum level and generate their CRDs against that version.

That said, do we not declare a compatibility matrix between controller tools versions and the KAS that the CRD is to be installed on? How did we handle the v1beta1 to v1 API extensions group change for example?

I do not understand the concern. The restrictions on label names and values have been in place for a long time. I do not understand how any shipped products can be working with invalid labels.

Restricting validation at admission time is a breaking change no matter what happens on the backend.

While yes, your use case of a label selector means that parsing it results in an error, you could in theory parse it in another way that means that the valid values are different (see upstream API guidelines about not re-using other people's types).

You also have to bear in mind that changing the validation will immediately break all writes to a currently invalid object. Which means that not only will users not be able to update the object unless they fix the error, but even status updates from controllers will fail. Outwardly, this can have very bad consequences such as "Oh the object actually looks good" when in fact it is very not good.

When ratcheting validation comes in, that latter problem is resolved. Only writes to the broken field will fail, writes to the rest of the object would succeed. This means that controllers can continue to operate, even if the spec is invalidated because of an updated validation

I see, thanks for the explanation.

It seems backwards to hurt developers who are trying to use LabelSelectors exactly as they are designed and intended, for the sake of old stuff that abuses them. Could there be a way to tell controller-gen "hey, I am using labels and label selectors in the normal way and want full validation"?

I think it comes down to doing the right thing per default and allowing a way to opt-out with a flag or something.