OCP NFD not serving nfd.k8s-sigs.io/v1alpha1

Question

mythi opened this issue 2 years ago · 26 comments

What happened:

I maintain a set of NodeFeatureRules:

apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: intel-dp-devices

but this fails to deploy on OpenShift:

The server doesn't have a resource type "kind: NodeFeatureRule, apiVersion: nfd.k8s-sigs.io/v1alpha1

What you expected to happen:
I can use my NodeFeatureRule on both OpenShift and vanilla Kubernetes without patching or maintaining cluster-specific copies of the same content.

How to reproduce it (as minimally and precisely as possible):
See my NodeFeatureRule

Answer 1 · 2022-05-02T15:06:31.000Z

Thanks for reporting this, @mythi. This has somehow stayed off my radar, as it's quite a new feature.

The API group should definitely match the NFD operand upstream. Also, importantly, we should not be using a *.k8s.io (or kubernetes.io) API domain when/if we haven't gone through the K8s API review. If we had, we should also have the corresponding api-approved.kubernetes.io: https://github.com/kubernetes/enhancements/pull/<kep> annotation in place.

/assign @ArangoGutierrez

Answer 2 · 2022-05-10T13:54:34.000Z

We are using k8s-sigs for upstream, and I have set https://github.com/openshift/cluster-nfd-operator/blob/master/config/crd/bases/nfd.openshift.io_v1alpha1_nodefeaturerules.yaml#L9 to openshift for downstream. But you have a good point: that difference could break interoperability for users.

Answer 3 · 2022-08-08T14:43:08.000Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Answer 4 · 2022-08-08T19:33:30.000Z

/remove-lifecycle stale

Answer 5 · 2022-11-06T19:58:29.000Z

/lifecycle stale

Answer 6 · 2022-12-06T20:21:32.000Z

/lifecycle rotten

Answer 7 · 2022-12-08T09:46:43.000Z

/remove-lifecycle rotten

Answer 8 · 2023-01-17T13:33:07.000Z

I no longer maintain the OpenShift downstream version.
/assign @yevgeny-shnaidman

Answer 9 · 2023-04-17T14:10:25.000Z

/lifecycle stale

Answer 10 · 2023-04-17T14:12:51.000Z

/remove-lifecycle stale

Answer 11 · 2023-04-17T17:44:00.000Z

PING @yevgeny-shnaidman

Answer 12 · 2023-04-18T09:07:10.000Z

@mythi, which NFD version are you deploying on the OCP cluster: the upstream version or the OCP version?

Answer 13 · 2023-04-18T10:30:31.000Z

The idea of this issue is that users can easily move from upstream to OCP and back without having to edit their CRDs because of the difference in the API group name.

See https://github.com/openshift/node-feature-discovery/blob/master/deployment/base/nfd-crds/nfd-api-crds.yaml#L8

  creationTimestamp: null
  name: nodefeatures.nfd.openshift.io
spec:
  group: nfd.openshift.io

It says openshift instead of apiVersion: nfd.k8s-sigs.io, so users are forced to keep two sets of CRs, which adds maintenance complexity.
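To make the maintenance cost concrete, here is a minimal sketch of the kind of rewrite step users end up needing today when they ship one manifest to both cluster flavors. The two group names come from this thread; the helper name `rewrite_api_group` is hypothetical, and the sketch assumes both groups serve the same version (`v1alpha1`) with the same schema, which has not been verified here.

```python
import re

# Group names taken from this thread; an identical version and schema in
# both groups is an assumption, not something confirmed above.
UPSTREAM_GROUP = "nfd.k8s-sigs.io"
DOWNSTREAM_GROUP = "nfd.openshift.io"

# Matches a whole "apiVersion: <group>/<version>" line. Only spaces and tabs
# are allowed as padding so a match never swallows a newline.
_API_VERSION_LINE = re.compile(
    r"^([ \t]*apiVersion:[ \t]*)([^/\s]+)(/\S+[ \t]*)$",
    re.MULTILINE,
)


def rewrite_api_group(manifest: str, source_group: str, target_group: str) -> str:
    """Hypothetical helper: move apiVersion lines from source_group to target_group."""

    def swap(match: re.Match) -> str:
        if match.group(2) != source_group:
            return match.group(0)  # leave other API groups untouched
        return match.group(1) + target_group + match.group(3)

    return _API_VERSION_LINE.sub(swap, manifest)


if __name__ == "__main__":
    rule = (
        "apiVersion: nfd.k8s-sigs.io/v1alpha1\n"
        "kind: NodeFeatureRule\n"
        "metadata:\n"
        "  name: intel-dp-devices\n"
    )
    print(rewrite_api_group(rule, UPSTREAM_GROUP, DOWNSTREAM_GROUP))
```

This is a plain text substitution, not a YAML-aware transform, so it is only a stopgap. Having to carry a step like this at all is the interoperability problem the issue is about.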

Answer 14 · 2023-04-18T11:00:26.000Z

I understand that, and I am not against the idea, but it also means that current OCP NFD customers will have to change their code/deployment.

Answer 15 · 2023-04-18T11:06:38.000Z

Maybe OCP NFD can provide a migration path over a span of three releases, by adding a flag or config option to enable the upstream API and documenting it well enough that users know they have three releases (which in OCP is about a year) to migrate their CRs to the upstream API.
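During such a migration window a cluster might serve either group, or both. As a rough sketch of how deployment tooling could cope on the client side, the function below picks an apiVersion from the text printed by `kubectl api-versions` (one `group/version` entry per line), preferring the upstream group when it is served. The function name and the preference order are assumptions made for illustration, not part of NFD or the operator.

```python
# Preference order is an assumption: upstream first, downstream as fallback.
PREFERRED_GROUPS = ("nfd.k8s-sigs.io", "nfd.openshift.io")


def pick_nfd_api_version(api_versions_output: str) -> str:
    """Hypothetical helper: choose the NFD apiVersion a cluster actually serves.

    api_versions_output is expected to hold one "group/version" entry per
    line, as printed by `kubectl api-versions`.
    """
    served = [
        line.strip()
        for line in api_versions_output.splitlines()
        if line.strip()
    ]
    for group in PREFERRED_GROUPS:
        for entry in served:
            if entry.startswith(group + "/"):
                return entry
    raise LookupError("no NFD API group is served by this cluster")


if __name__ == "__main__":
    sample = "apps/v1\nnfd.openshift.io/v1alpha1\nv1\n"
    print(pick_nfd_api_version(sample))
```

The chosen value could then be written into the manifest before applying it. This only softens the transition for tooling authors; it does not remove the need for the groups to converge.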

Answer 16 · 2023-04-18T11:18:07.000Z

@ArangoGutierrez @mythi, what about allowing upstream NFD to be deployed on OCP? I just need to check that it works.

Answer 17 · 2023-04-18T11:21:37.000Z

What do you mean by "allow"? The only difference between upstream and OCP is the SCC and RBAC bits needed by OCP; everything else works. IMO, that is not what @mythi is trying to convey here.

Answer 18 · 2023-04-18T12:01:46.000Z

What I mean is that instead of installing OCP NFD, @mythi can install upstream NFD on OCP, and that way continue using his NodeFeatureRule YAML without any change. I just need to make sure that installing upstream NFD on OCP works. That way he can have an immediate solution to his issue.

Answer 19 · 2023-04-18T12:23:24.000Z

What I mean is that instead of installing OCP NFD, @mythi can install upstream NFD on OCP

One thing to clarify is that it's not what I can do or cannot do. If I provide my device plugin users some NodeFeatureRules, they should just run.

Answer 20 · 2023-04-18T12:33:48.000Z

OK @mythi, I understand your use case now. We will check how to propagate it.

Answer 21 · 2023-07-17T12:44:33.000Z

/lifecycle stale

Answer 22 · 2023-07-17T13:08:55.000Z

/remove-lifecycle stale