node-observability-operator

An Operator that gathers debugging and profiling data over a configurable period of time to help troubleshoot and resolve issues for OpenShift customers.

Primary language: Go. License: Apache License 2.0 (Apache-2.0).

NodeObservability Operator

The NodeObservability Operator allows you to deploy and manage the NodeObservability Agent on worker nodes. The agent is deployed through DaemonSets on all or selected nodes. It also triggers the collection of crio and kubelet profile data, which is stored on the node's hostPath for later retrieval.

Note: This Operator is in the early stages of implementation and is subject to change.

Deploying the NodeObservability Operator

You can deploy the NodeObservability Operator for a BareMetal installation by using the following procedure:

Installing the NodeObservability Operator

You can install the NodeObservability Operator by building and pushing the Operator image into a registry.

  1. To build and push the Operator image into a registry, run the following commands:
    # Set the CONTAINER_ENGINE environment variable to the preferred container manager tool (default is podman)
    $ export IMG=<registry>/<username>/node-observability-operator:latest
    $ make container-build
    $ make container-push
  2. To deploy the NodeObservability Operator, run the following command:
    $ make deploy
    
    If you want to specify the agent image of your choice, patch the operator deployment with the following command:
    $ oc set env deployment/node-observability-operator --containers=manager RELATED_IMAGE_AGENT=${MY_IMAGE_AGENT} -n node-observability-operator
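
The build, push, and deploy steps above can be wrapped in a small helper. The following is a minimal sketch, not part of the repository: the function names operator_image_ref and build_and_deploy are hypothetical, and it assumes the make targets behave as described above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; only IMG and the make targets come from the steps above.

# Print <registry>/<username>/node-observability-operator:<tag> (tag defaults to latest).
operator_image_ref() {
  local registry="$1" username="$2" tag="${3:-latest}"
  if [ -z "$registry" ] || [ -z "$username" ]; then
    echo "usage: operator_image_ref <registry> <username> [tag]" >&2
    return 1
  fi
  printf '%s/%s/node-observability-operator:%s\n' "$registry" "$username" "$tag"
}

# Build, push, and deploy the Operator image; stops at the first failing step.
build_and_deploy() {
  IMG="$(operator_image_ref "$@")" || return 1
  export IMG
  make container-build && make container-push && make deploy
}
```

For example, running build_and_deploy quay.io alice from the repository root would export IMG=quay.io/alice/node-observability-operator:latest before invoking the make targets.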

Creating the local NodeObservability

  1. To run the Operator locally, run the following make targets, then apply a sample custom resource (CR):
    $ make install
    $ make run
    # In another terminal, apply the sample CR
    $ oc apply -f config/samples/nodeobservability_v1alpha1_nodeobservability-all.yaml
    # Alternatively, create a new CR and change the fields accordingly

Installing the Node Observability Operator using a custom index image on the OperatorHub

Note: It is recommended to use podman as a container engine.

Prerequisites

  • OpenShift Container Platform cluster (CodeReady Containers for development).

Procedure

  1. Build and push the Operator image to the registry:

    $ export IMG=${REGISTRY}/${REPOSITORY}/node-observability-operator:${VERSION}
    $ make container-build container-push
  2. Build and push the bundle image to the registry:

    a. Set the Operator image that you built in the bundle/manifests/node-observability-operator.clusterserviceversion.yaml file:

    $ sed -i "s|quay.io/openshift/origin-node-observability-operator:latest|${IMG}|g" bundle/manifests/node-observability-operator.clusterserviceversion.yaml

    b. Build and push the bundle image:

    $ export BUNDLE_IMG=${REGISTRY}/${REPOSITORY}/node-observability-operator-bundle:${VERSION}
    $ make bundle-build bundle-push
  3. Build and push the index image to the registry:

    $ export INDEX_IMG=${REGISTRY}/${REPOSITORY}/node-observability-operator-bundle-index:${VERSION}
    $ make index-image-build index-image-push
  4. (Optional) If the image is not public, link a registry pull secret to the default service account in the openshift-marketplace namespace so that the node-observability-operator catalog pod created there can pull the image:

    a. Create a secret with authentication details of your image registry:

    $ oc -n openshift-marketplace create secret generic nodeobs-olm-secret --type=kubernetes.io/dockercfg --from-file=.dockercfg=${XDG_RUNTIME_DIR}/containers/auth.json

    b. Link the secret to the default service account:

    $ oc -n openshift-marketplace secrets link default nodeobs-olm-secret --for=pull
  5. Create the CatalogSource object:

    cat <<EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: node-observability-operator
      namespace: openshift-marketplace
    spec:
      sourceType: grpc
      image: ${INDEX_IMG}
    EOF
    
  6. Create the Operator namespace:

    $ oc create namespace node-observability-operator
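
The three image names in this procedure share one prefix, and the CatalogSource heredoc silently produces an empty image field when INDEX_IMG is unset. The following is a minimal sketch of guarding against that. The function names derive_images and render_catalogsource are hypothetical; the image naming and the manifest are taken from the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; image names and the manifest mirror the procedure above.

# Set and export IMG, BUNDLE_IMG, and INDEX_IMG from registry, repository, and version.
derive_images() {
  if [ "$#" -ne 3 ] || [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
    echo "usage: derive_images <registry> <repository> <version>" >&2
    return 1
  fi
  local base="$1/$2/node-observability-operator"
  IMG="${base}:$3"
  BUNDLE_IMG="${base}-bundle:$3"
  INDEX_IMG="${base}-bundle-index:$3"
  export IMG BUNDLE_IMG INDEX_IMG
}

# Print the CatalogSource manifest; refuses to render when INDEX_IMG is empty.
render_catalogsource() {
  if [ -z "${INDEX_IMG:-}" ]; then
    echo "INDEX_IMG is empty; run derive_images first" >&2
    return 1
  fi
  cat <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: node-observability-operator
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: ${INDEX_IMG}
EOF
}
```

A possible use is derive_images "${REGISTRY}" "${REPOSITORY}" "${VERSION}" followed by render_catalogsource | oc apply -f -, which keeps the build steps and the CatalogSource pointing at the same tag.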

From the CLI

  1. Create the OperatorGroup object to scope the Operator to node-observability-operator namespace:

    cat <<EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      name: node-observability-operator
      namespace: node-observability-operator
    spec:
      targetNamespaces:
      - node-observability-operator
    EOF
    
  2. Create the Subscription object:

    cat <<EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: node-observability-operator
      namespace: node-observability-operator
    spec:
      channel: alpha
      name: node-observability-operator
      source: node-observability-operator
      sourceNamespace: openshift-marketplace
    EOF
    

    To use an agent image of your choice, create the Subscription object with the RELATED_IMAGE_AGENT environment variable set instead:

    cat <<EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: node-observability-operator
      namespace: node-observability-operator
    spec:
      channel: alpha
      name: node-observability-operator
      source: node-observability-operator
      sourceNamespace: openshift-marketplace
      config:
        env:
          - name: RELATED_IMAGE_AGENT
            value: ${MY_IMAGE_AGENT}
    EOF
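
The two Subscription variants above differ only in the config.env entry, so they can be generated from one template. The following is a minimal sketch; render_subscription is a hypothetical helper name, and the manifest fields are copied from the examples above.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the manifest fields mirror the Subscription examples above.

# Print the Subscription manifest. When an agent image is passed as the first
# argument, append a config.env entry that sets RELATED_IMAGE_AGENT.
render_subscription() {
  local agent_image="${1:-}"
  cat <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: node-observability-operator
  namespace: node-observability-operator
spec:
  channel: alpha
  name: node-observability-operator
  source: node-observability-operator
  sourceNamespace: openshift-marketplace
EOF
  if [ -n "$agent_image" ]; then
    cat <<EOF
  config:
    env:
      - name: RELATED_IMAGE_AGENT
        value: ${agent_image}
EOF
  fi
}
```

For example, render_subscription "${MY_IMAGE_AGENT}" | oc apply -f - would create the Subscription with a custom agent image, and render_subscription | oc apply -f - would create it with the default one.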
    

From the UI

To install the Node Observability Operator from the web console, follow these steps:

  1. Log in to the OpenShift Container Platform web console.

  2. Navigate to Operators → OperatorHub.

  3. Type Node Observability Operator into the filter box and select it.

  4. Click Install.

  5. On the Install Operator page, select the option to install into a specific namespace on the cluster, then select node-observability-operator from the drop-down menu.

When the installation finishes, the Node Observability Operator is listed in the Installed Operators section of the web console.

Verification

  • Use the following commands to verify that the Node Observability Operator has been installed:

    $ oc get catalogsource -n openshift-marketplace
    $ oc get operatorgroup -n node-observability-operator
    $ oc get subscription -n node-observability-operator
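
The checks above can also be scripted. The following is a minimal sketch under two assumptions: the objects were created from the CLI with the name node-observability-operator, as in the manifests above, and the verify_install function and the OC override variable are hypothetical names introduced here. Querying each object by name makes a missing object an error, whereas listing a resource type normally succeeds even when nothing is found.

```shell
#!/usr/bin/env bash
# Hypothetical helper; object names assume the CLI manifests shown earlier.

# Query each object by name and return non-zero if any query fails.
# OC may name an alternative command (for example a stub); it defaults to oc.
verify_install() {
  local oc_bin="${OC:-oc}" failed=0
  "$oc_bin" get catalogsource node-observability-operator -n openshift-marketplace || failed=1
  "$oc_bin" get operatorgroup node-observability-operator -n node-observability-operator || failed=1
  "$oc_bin" get subscription node-observability-operator -n node-observability-operator || failed=1
  return "$failed"
}
```

Running verify_install && echo "installed" prints the confirmation only when all three lookups succeed. An installation done from the web console may use different object names, in which case the names in the sketch need adjusting.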