openshift/machine-config-operator

layering: pod-like MachineImage idea

cgwalters opened this issue · 4 comments

I've been trying to sketch this out in chats around OCP CoreOS Layering.

Right now the API for interacting with layering is basically "override osImageURL"; the rest of the MCO machinery then works the same way it does today. Most notably for this topic, the MCO (node controller) still owns the node rollout order, i.e. which nodes get the update and in what order. This relates to #3009 as well as #2163 etc.

Once we have the "build controller" aspect of the MCO (ideally, once we have #3137), the admin could instead "take the wheel" even more fully and own creating something very much like a cut-down Pod, assigned to a node or a node selector, which contains the OS image they want to use:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineImage
metadata:
  name: worker-05-image
spec:
  image: quay.io/examplecorp/rhel8-worker@sha256:...
  nodeName: worker-05

Here we're using nodeName to assign an image to a particular node.

But it should work with all the standard mechanisms for assigning pods to nodes, for example node selectors:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineImage
metadata:
  name: worker-image
spec:
  image: quay.io/examplecorp/rhel8-worker@sha256:...
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""

As a strawman: once a MachineImage object is created, the MCO stops rolling out images on its own for all nodes in any pool that has a matching MachineImage, or something along those lines.

Pull secret

Also, we should add a pull secret field to this CRD so that the credentials used to pull the image can be configured.
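
A rough sketch of what that could look like, mirroring the imagePullSecrets field on a Pod spec. The field name and the Secret name here are assumptions for illustration only, not an agreed API, and where the referenced Secret would live is left open:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineImage
metadata:
  name: worker-05-image
spec:
  image: quay.io/examplecorp/rhel8-worker@sha256:...
  nodeName: worker-05
  # Hypothetical field, modeled on Pod.spec.imagePullSecrets.
  # Each entry names a Secret (assumed to be of type
  # kubernetes.io/dockerconfigjson) holding registry credentials.
  imagePullSecrets:
    - name: examplecorp-pull-secret  # hypothetical Secret name
```

Reusing the Pod field shape would keep the object "pod-like", but a single-valued reference could work just as well.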

@openshift-bot: Closing this issue.