kubernetes-sigs/kubectl-validate

Panics when `--local-crds` folder contains some invalid CRDs

knutgoetz opened this issue · 4 comments

What happened?

We are running kubectl-validate in CI with `--local-crds` pointing to a folder of CRDs that were previously fetched from a running cluster with `kubectl get crds -o yaml`.

❯ kubectl-validate some/resources --local-crds ./crds
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1a15538]
goroutine 1 [running]:
sigs.k8s.io/kubectl-validate/pkg/openapiclient.(*localCRDsClient).Paths(0xc00061a0c0)
        sigs.k8s.io/kubectl-validate/pkg/openapiclient/local_crds.go:108 +0xc78
sigs.k8s.io/kubectl-validate/pkg/openapiclient.compositeClient.Paths({{0xc00034fcb0?, 0x18?, 0x1b65520?}})
        sigs.k8s.io/kubectl-validate/pkg/openapiclient/composite.go:23 +0x111
sigs.k8s.io/kubectl-validate/pkg/openapiclient.overlayClient.Paths({{0x4a05320?, 0xc000450a20?}, 0x0?})
        sigs.k8s.io/kubectl-validate/pkg/openapiclient/overlay.go:49 +0x4c
sigs.k8s.io/kubectl-validate/pkg/validator.New({0x4a05380?, 0xc000450bd0?})
        sigs.k8s.io/kubectl-validate/pkg/validator/validator.go:35 +0x2b
sigs.k8s.io/kubectl-validate/pkg/cmd.(*commandFlags).Run(0xc000482000, 0x0?, {0xc00034fc20, 0x1, 0x3})
        sigs.k8s.io/kubectl-validate/pkg/cmd/validate.go:168 +0x554
github.com/spf13/cobra.(*Command).execute(0xc000356300, {0xc0000500d0, 0x3, 0x3})
        github.com/spf13/cobra@v1.7.0/command.go:940 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0xc000356300)
        github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3bd
github.com/spf13/cobra.(*Command).Execute(0xc0000061a0?)
        github.com/spf13/cobra@v1.7.0/command.go:992 +0x19
main.main()
        sigs.k8s.io/kubectl-validate/main.go:11 +0x1e

What did you expect to happen?

Don't panic :) Instead, give some feedback that invalid CRDs were provided, and ignore them.

How can we reproduce it (as minimally and precisely as possible)?

Create an invalid CRD and run kubectl-validate with `--local-crds` pointing at it:

mkdir /tmp/crds
cat > /tmp/crds/adapters.config.istio.io.yaml << EOF
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    helm.sh/resource-policy: keep
  name: adapters.config.istio.io
spec:
  conversion:
    strategy: None
  group: config.istio.io
  names:
    categories:
    - istio-io
    - policy-istio-io
    kind: adapter
    listKind: adapterList
    plural: adapters
    singular: adapter
  preserveUnknownFields: true
  scope: Namespaced
  versions:
  - name: v1alpha2
    served: true
    storage: true
    subresources:
      status: {}
EOF
kubectl validate . --local-crds /tmp/crds

Anything else we need to know?

We created these CRDs when we were scraping our clusters with `kubectl get crds -o yaml` to use them in CI, so there must have been a time when the API server accepted them.
While debugging the issue, I noticed that this helper function does not return an error when confronted with the CRD.
This then results in a nil pointer dereference on `jsProps.OpenAPIV3Schema` here:

ss, err := structuralschema.NewStructural(jsProps.OpenAPIV3Schema)
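To make the failing condition concrete, here is a minimal, hypothetical Go sketch (my own code, not kubectl-validate's) that loads the repro CRD from above and reports any version whose `openAPIV3Schema` is nil, which is the pointer that ends up being dereferenced. I assume a fix in `local_crds.go` would need a similar guard (or an explicit warning/error) before calling `structuralschema.NewStructural`.

```go
// nil_schema_check.go -- a hypothetical standalone sketch, not the project's code.
// It flags served CRD versions that define no spec.versions[*].schema.openAPIV3Schema,
// which is exactly the condition the validator currently trips over.
package main

import (
	"fmt"
	"os"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	// Path taken from the repro above.
	data, err := os.ReadFile("/tmp/crds/adapters.config.istio.io.yaml")
	if err != nil {
		panic(err)
	}

	var crd apiextensionsv1.CustomResourceDefinition
	if err := yaml.Unmarshal(data, &crd); err != nil {
		panic(err)
	}

	for _, ver := range crd.Spec.Versions {
		// The guard I would expect before building a structural schema:
		// skip (or report) versions carrying no openAPIV3Schema instead of
		// passing a nil pointer to structuralschema.NewStructural.
		if ver.Schema == nil || ver.Schema.OpenAPIV3Schema == nil {
			fmt.Printf("CRD %s, version %s: no openAPIV3Schema defined\n", crd.Name, ver.Name)
		}
	}
}
```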

I guess I will delete the CRDs in the cluster in the near future. Funnily enough, I could have avoided the issue by using kubectl-validate to validate the CRD itself:

❯ kubectl validate /tmp/crds/adapters.config.istio.io.yaml 
/tmp/crds/adapters.config.istio.io.yaml...ERROR
CustomResourceDefinition.apiextensions.k8s.io "adapters.config.istio.io" is invalid: [spec.versions[0].schema.openAPIV3Schema: Required value: schemas are required, spec.preserveUnknownFields: Invalid value: true: cannot set to true, set x-kubernetes-preserve-unknown-fields to true in spec.versions[*].schema instead]
Error: validation failed

I would work on a fix, if desired.

Kubernetes version

$ kubectl version
# not relevant

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.