canonical/notebook-operators

controllers.Culler logs 403 on a request to /api/kernels

Bug Description

Notebook culling stopped working after upgrading the kubeflow-profiles image to 1.8.0-rc.2 in PR canonical/kubeflow-profiles-operator#155. The cause is the AuthorizationPolicy now applied in the profile's namespaces:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  annotations:
    role: admin
    user: admin
  creationTimestamp: "2023-10-20T11:36:24Z"
  generation: 1
  name: ns-owner-access-istio
  namespace: admin
  ownerReferences:
  - apiVersion: kubeflow.org/v1
    blockOwnerDeletion: true
    controller: true
    kind: Profile
    name: admin
    uid: 4ac4ba90-d64f-470b-8562-d965b4dee3f1
  resourceVersion: "70076"
  uid: 76fde5db-f822-40ac-bb4d-68db474e17e3
spec:
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/kubeflow/sa/istio-ingressgateway-workload-service-account
        - cluster.local/ns/kubeflow/sa/kfp-ui
    when:
    - key: request.headers[kubeflow-userid]
      values:
      - admin
  - when:
    - key: source.namespace
      values:
      - admin
  - to:
    - operation:
        paths:
        - /healthz
        - /metrics
        - /wait-for-drain
  - from:
    - source:
        principals:
        - cluster.local/ns/kubeflow/sa/jupyter-controller
    to:
    - operation:
        methods:
        - GET
        paths:
        - '*/api/kernels'

To Reproduce

  1. Deploy kubeflow latest/edge
  2. juju refresh kubeflow-profiles --channel=latest/edge/pr-155 --resource profile-image=docker.io/kubeflownotebookswg/profile-controller:v1.8.0-rc.2 --resource kfam-image=docker.io/kubeflownotebookswg/kfam:v1.8.0-rc.2

Environment

juju 3.1/stable
microk8s 1.25-strict/stable

Relevant log output

[jupyter-controller] 1.6977991224720206e+09    INFO    controllers.Culler    Warning: GET to http://mynb.admin.svc.cluster.local/notebook/admin/mynb/api/kernels: 403

Additional context

No response

The issue here is that the jupyter-controller pod does not have an Istio sidecar, so its requests are not authenticated with mTLS. mTLS is needed because the AuthorizationPolicy rule matches on source principals; from the Istio docs:

This field requires mTLS enabled

This issue is related to canonical/kfp-operators#355.

What blocks culling is specifically this rule in the AuthorizationPolicy applied by the kubeflow-profiles workload:

  - from:
    - source:
        principals:
        - cluster.local/ns/kubeflow/sa/jupyter-controller
    to:
    - operation:
        methods:
        - GET
        paths:
        - '*/api/kernels'

To fix this, we need a rule that does not check the source of the request, so the new rule should be:

  - to:
    - operation:
        methods:
        - GET
        paths:
        - '*/api/kernels'
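
To illustrate why dropping the source check unblocks the culler, here is a minimal Python sketch of how a single ALLOW rule could be evaluated. It is a simplified model and not Istio's actual implementation: the `Request` and `Rule` classes and the `path_matches` and `rule_allows` helpers are hypothetical names made up for this example. It assumes that a request from a pod without a sidecar reaches the notebook with an empty principal, and that a path pattern starting with `*` is a suffix match. Only the one rule discussed above is modelled, not the whole policy.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Request:
    method: str
    path: str
    # Assumption: empty when the caller has no sidecar, hence no mTLS identity.
    principal: str = ""


@dataclass
class Rule:
    # An empty list means "do not constrain this attribute".
    principals: List[str] = field(default_factory=list)
    methods: List[str] = field(default_factory=list)
    paths: List[str] = field(default_factory=list)


def path_matches(pattern: str, path: str) -> bool:
    """Exact, prefix ('/x*') or suffix ('*/x') match, as a simplified model."""
    if pattern == "*":
        return True
    if pattern.startswith("*"):
        return path.endswith(pattern[1:])
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return pattern == path


def rule_allows(rule: Rule, req: Request) -> bool:
    """A rule matches only if every constraint it declares is satisfied."""
    if rule.principals and req.principal not in rule.principals:
        return False
    if rule.methods and req.method not in rule.methods:
        return False
    if rule.paths and not any(path_matches(p, req.path) for p in rule.paths):
        return False
    return True


# The culler's request from the log above; no sidecar, so no principal.
culler_request = Request(method="GET", path="/notebook/admin/mynb/api/kernels")

old_rule = Rule(
    principals=["cluster.local/ns/kubeflow/sa/jupyter-controller"],
    methods=["GET"],
    paths=["*/api/kernels"],
)
new_rule = Rule(methods=["GET"], paths=["*/api/kernels"])

print("old rule allows culler:", rule_allows(old_rule, culler_request))
print("new rule allows culler:", rule_allows(new_rule, culler_request))
```

Under these assumptions the old rule rejects the culler's request because the empty principal is not in its list, while the new rule accepts any GET whose path ends in `/api/kernels`, regardless of who sent it.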