GoogleCloudPlatform/elcarro-oracle-operator

Configure Affinity and Tolerations for DB pods

nblxa opened this issue · 2 comments

nblxa commented

Is your feature request related to a problem? Please describe.
Oracle recommends adjusting Linux kernel parameters (https://docs.oracle.com/en/database/oracle/oracle-database/19/ladbi/changing-kernel-parameter-values.html#GUID-FB0CC366-61C9-4AA2-9BE7-233EB6810A31). Some of these are not namespaced, for instance fs.aio-max-nr, so they can only be changed at the node level. The default value on GKE nodes (65536) is lower than the Oracle recommendation. It is therefore necessary to use affinity so that Instance DB pods are scheduled only on specifically prepared nodes.
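
One common way to prepare such nodes is a privileged DaemonSet that raises the sysctl on labeled nodes only. The manifest below is a sketch under assumptions and is not part of El Carro: the name oracle-node-sysctl, the kube-system namespace, the container images, and the app=oracle label and taint are illustrative. The value 1048576 is the fs.aio-max-nr setting from the Oracle documentation linked above.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: oracle-node-sysctl          # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: oracle-node-sysctl
  template:
    metadata:
      labels:
        name: oracle-node-sysctl
    spec:
      # Run only on nodes set aside for Oracle (assumed label).
      nodeSelector:
        app: oracle
      # Tolerate the taint that keeps other workloads off those nodes (assumed taint).
      tolerations:
      - key: app
        operator: Equal
        value: oracle
        effect: NoSchedule
      initContainers:
      # fs.aio-max-nr is not namespaced, so a privileged container
      # writing it changes the value for the whole node.
      - name: set-aio-max-nr
        image: busybox:1.36
        command: ["sysctl", "-w", "fs.aio-max-nr=1048576"]
        securityContext:
          privileged: true
      containers:
      # Keeps the pod alive after the init container has finished.
      - name: pause
        image: registry.k8s.io/pause:3.9
```

The setting does not survive node recreation on its own, but the DaemonSet reapplies it whenever a new matching node joins the cluster.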

In addition, tolerations are quite helpful for achieving workload separation.
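
For illustration, a node prepared for this kind of separation carries both a label (matched by affinity) and a taint (which the DB pods must tolerate). The fragment below shows only the relevant fields of a Node object. The node name is hypothetical and the app=oracle key and value are assumptions chosen to match the example in this request. On GKE these are usually set on the node pool rather than on individual nodes.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: gke-oracle-pool-node-1      # hypothetical node name
  labels:
    app: oracle                     # matched by the nodeAffinity rule
spec:
  taints:
  - key: app                        # repels pods that lack a matching toleration
    value: oracle
    effect: NoSchedule
```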

Describe the solution you'd like
Parameters related to scheduling can be set on the Instance, and will be propagated by the operator to managed StatefulSets and possibly the Deployments as well.
Example:

apiVersion: oracle.db.anthosapis.com/v1alpha1
kind: Instance
metadata:
  name: mydb
spec:
  cdbName: CDB
  cloudProvider: GCP
  databaseResources:
    requests:
      memory: 4.0Gi
  ...
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: app
            operator: In
            values:
            - oracle
  tolerations:
    - key: app
      value: oracle
      effect: NoSchedule

Alternatively, only the nodeAffinity part of the affinity element could be supported.

Describe alternatives you've considered

  1. Adjust kernel parameters on all nodes in the GKE cluster, which may cause conflicts.
  2. Use a mutating admission controller to modify pods belonging to StatefulSets managed by El Carro. The downside is the increased complexity.
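
To give a sense of the extra moving parts in the second alternative, here is a minimal sketch of the webhook registration alone. Every name in it (the configuration name, the webhook name, the webhooks namespace, the pod-scheduling-injector service, the /mutate path, and the namespace label) is hypothetical. The service that actually patches affinity and tolerations into the pod would still have to be written, deployed, and given a TLS certificate.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: oracle-pod-scheduling             # hypothetical
webhooks:
- name: pod-scheduling.example.com        # hypothetical, must be a qualified name
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Ignore                   # do not block pod creation if the webhook is down
  clientConfig:
    service:
      namespace: webhooks                 # hypothetical
      name: pod-scheduling-injector       # hypothetical
      path: /mutate
    # caBundle: <base64-encoded CA certificate that signed the webhook's TLS cert>
  rules:
  - operations: ["CREATE"]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
  # Limit the webhook to namespaces that opt in (assumed label).
  namespaceSelector:
    matchLabels:
      inject-oracle-scheduling: "true"
```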

Another possible use for affinity and tolerations is targeting specific runtimes, such as GKE's gVisor node pool: https://cloud.google.com/kubernetes-engine/docs/how-to/sandbox-pods#regular-pod

This is a possible alternative to supporting the statefulset.spec.template.spec.runtimeClassName attribute on pods (which allows gVisor to automatically add the appropriate affinity and tolerations for its nodes).
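
As a sketch of what that targeting looks like at the pod level, the manifest below places a regular pod on a GKE Sandbox node pool using the sandbox.gke.io/runtime label and taint described in the linked GKE page. The pod name and image are hypothetical. The commented-out runtimeClassName line shows the alternative mentioned above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: regular-pod-on-sandbox-node       # hypothetical
spec:
  # runtimeClassName: gvisor              # alternative: run the pod sandboxed instead
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: sandbox.gke.io/runtime   # label on GKE Sandbox nodes
            operator: In
            values:
            - gvisor
  tolerations:
  - key: sandbox.gke.io/runtime           # taint on GKE Sandbox nodes
    operator: Equal
    value: gvisor
    effect: NoSchedule
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9      # placeholder image
```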

This has been added: PR for Affinity and PR for Tolerations. Users can now configure affinity and tolerations via a podSpec field in the Instance spec. This field holds pod-level settings.
Instance CRD Snippet with Toleration Configuration:


apiVersion: oracle.db.anthosapis.com/v1alpha1
kind: Instance
metadata:
  name: my-db-3
spec:
  podSpec:
    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
  type: Oracle
  version: "19.3"
  retainDisksAfterInstanceDeletion: false
  edition: Enterprise

Instance CRD Snippet with Affinity Configuration:


apiVersion: oracle.db.anthosapis.com/v1alpha1
kind: Instance
metadata:
  name: my-db-3
spec:
  podSpec:
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - gke-cluster4-default-pool-0028fdd4-pw2x
  type: Oracle
  version: "19.3"
  retainDisksAfterInstanceDeletion: false
  edition: Enterprise
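
For the use case in the original request, the two settings would presumably be combined in a single Instance. The sketch below assumes that podSpec accepts affinity and tolerations together and that a required node affinity rule is honored in the same way as the preferred one shown above. Neither assumption is confirmed in this thread, and the app=oracle label and taint are carried over from the request as illustrative values.

```yaml
apiVersion: oracle.db.anthosapis.com/v1alpha1
kind: Instance
metadata:
  name: mydb
spec:
  podSpec:
    # Hard requirement: schedule only on nodes labeled app=oracle.
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: app
              operator: In
              values:
              - oracle
    # Allow scheduling onto nodes tainted app=oracle:NoSchedule.
    tolerations:
    - key: app
      operator: Equal
      value: oracle
      effect: NoSchedule
  type: Oracle
  version: "19.3"
  edition: Enterprise
```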