Soluto/kamus

Operation returned an invalid status code 'Forbidden'

mrfelton opened this issue · 10 comments

Describe the bug

Hi. My kamus-controller workload is in a crash loop with the following:

{
 insertId: "saxa15st9g1bwertn"  
 jsonPayload: {
  Exception: "Microsoft.Rest.HttpOperationException: Operation returned an invalid status code 'Forbidden'
   at k8s.Kubernetes.ListClusterCustomObjectWithHttpMessagesAsync(String group, String version, String plural, String fieldSelector, String labelSelector, String resourceVersion, Nullable`1 timeoutSeconds, Nullable`1 watch, String pretty, Dictionary`2 customHeaders, CancellationToken cancellationToken)
   at CustomResourceDescriptorController.utils.KubernetesExtensions.<>c__DisplayClass0_0`1.<<ObserveClusterCustomObject>b__0>d.MoveNext() in /app/crd-controller/utils/KubernetesExtensions.cs:line 20"   
  Level: "Error"   
  MessageTemplate: "Unexpected error occured while watching KamusSecret events"   
  Properties: {…}   
  Timestamp: "2019-10-03T12:11:52.7407233+00:00"   
 }
}

It might be worth noting that I could not get the installation to work as per the documentation, which asks to run the command:

helm upgrade --install kamus soluto/kamus -f values.yaml --set-string keyManagement.googleKms.credentials="$(cat credentials.json | base64)"

Which fails with the following:

UPGRADE FAILED
Error: YAML parse error on shared/charts/kamus/templates/secret.yaml: error converting YAML to JSON: yaml: line 11: could not find expected ':'
Error: UPGRADE FAILED: YAML parse error on shared/charts/kamus/templates/secret.yaml: error converting YAML to JSON: yaml: line 11: could not find expected ':'

The only way I could get it to install was to run cat credentials.json | base64, manually remove all the newlines from the output, and add the result to my helm values like so:

kamus:
  image:
    version: 0.5.2.0
  keyManagement:
    provider: GoogleKms
    googleKms:
      projectId: [REDACTED]
      location: [REDACTED]
      keyRing: [REDACTED]
      protectionLevel: SOFTWARE
      credentials: ewogICJ0eXBlIjog.......

and then deploy like:

helm upgrade shared . --install --debug --namespace ${ENVIRONMENT} --values values.yaml --recreate-pods --force
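For reference, the documented one-liner can also be made to work by stripping the line wrapping that some base64 implementations add, rather than hand-editing the output (a sketch, not from the thread):

```shell
# Variant of the documented install command: strip newlines from the
# base64 output so the credentials value stays on a single line.
helm upgrade --install kamus soluto/kamus -f values.yaml \
  --set-string keyManagement.googleKms.credentials="$(base64 < credentials.json | tr -d '\n')"
```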

Versions used
Kamus (API images): 0.5.2.0
Kamus CLI: 0.2.3
Chart version: 0.4.1
KMS provider: Google KMS
Kubernetes flavour and version: GKE

Expected behavior

Container should start up without issue.

Also worth noting that I don't have any KamusSecret resources

And that I'm running on Kubernetes 1.14.6-gke.2

Can you copy the serviceAccount field in the kamus-controller deployment and the ClusterRole associated with this serviceAccount, please?

serviceAccount: kamus-controller
serviceAccountName: kamus-controller
kubectl describe sa kamus-controller -n shared
Name:                kamus-controller
Namespace:           shared
Labels:              <none>
Annotations:         <none>
Image pull secrets:  <none>
Mountable secrets:   kamus-controller-token-mvtz7
Tokens:              kamus-controller-token-mvtz7
Events:              <none>

It might be worth mentioning that I currently have 2 kamus instances set up:

  1. one in the default namespace which is not using Google KMS (this one is working fine)
  2. another one in the shared namespace, which is using Google KMS (the one I'm having trouble with)

Here is the service account from the default namespace (the one that's working). Looks a little different.

$ kubectl describe sa kamus-controller
Name:                kamus-controller
Namespace:           default
Labels:              app.kubernetes.io/instance=shared
Annotations:         kubectl.kubernetes.io/last-applied-configuration:
                       {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instance":"shared"},"name":"kamus...
Image pull secrets:  <none>
Mountable secrets:   kamus-controller-token-kcxr9
Tokens:              kamus-controller-token-kcxr9
Events:              <none>

Using helm template and dummy values:

# Source: kamus/templates/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: kamus
type: Opaque
data:
  appsettings.secrets.json: CnsKfQ==

  googlecloudcredentials.json: ewogICJ0eXBlIjog

I can see that line 11 is the googlecloudcredentials.json. I guess there is some invalid char there, so maybe you can try using the cat | base64 approach and see if it helps?
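The two rendered values can be inspected directly to look for an invalid character (a quick check using the dummy values above; both decode cleanly, which points at newlines in the base64 string itself rather than the decoded content):

```shell
# Decode the two data values from the rendered secret above.
printf 'CnsKfQ==' | base64 -d          # decodes to "{}" wrapped in newlines
printf 'ewogICJ0eXBlIjog' | base64 -d  # decodes to the start of the credentials JSON
```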

I'm a little unclear what you are suggesting here.

The only way I can get these templates to apply is by using cat credentials.json | base64 and then manually removing all the newlines from the output. However, I think there could still be a problem, because that results in a string that includes a newline at the end.

I updated to use cat credentials.json | tr -d \\n | base64 and this gives me the correctly encoded credentials, with no trailing newlines.

Do you have a better way of applying the credentials? Or do you know why the method suggested in the docs doesn't work correctly? I suspect it's due to a difference in how base64 behaves on mac vs other platforms. What platform are you operating on?
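For context on the suspected platform difference (an assumption worth verifying, not stated in the thread): GNU coreutils base64 wraps its output at 76 characters by default, while BSD/macOS base64 emits a single line, which would explain why the documented command works on one platform and not another. Two ways to get unwrapped output:

```shell
# GNU base64 wraps at 76 chars; BSD/macOS base64 does not.
base64 -w 0 credentials.json            # GNU coreutils only: disable wrapping
base64 < credentials.json | tr -d '\n'  # portable across GNU and BSD
```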

However, even after using the updated encoded value, I still get the same error.

Do you get any pointers from the message:

Unexpected error occured while watching KamusSecret events

You don't think this is an issue due to my GKE version or something? Nothing to do with the fact that I have 2 kamus deployments in the same cluster, each using a different provider?

Oh yeah, sorry - I was barking up the wrong tree. Can you please show the relevant ClusterRoleBinding? Also, I'm not sure how 2 kamus deployments are going to work: the controller watches KamusSecrets in all namespaces, so it's going to end in a race condition (both get the event, but only one succeeds in creating the secret). What you can do (temporarily) is delete one controller deployment and see if that helps.

Here is what the ClusterRoleBinding entries look like:

tom$ kubectl describe clusterrolebinding kamus
Name:         kamus
Labels:       app.kubernetes.io/instance=zap-shared
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRoleBinding","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instanc...
Role:
  Kind:  ClusterRole
  Name:  kamus
Subjects:
  Kind            Name   Namespace
  ----            ----   ---------
  ServiceAccount  kamus  default

tom$ kubectl describe clusterrolebinding kamus-controller
Name:         kamus-controller
Labels:       app.kubernetes.io/instance=zap-shared
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRoleBinding","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instanc...
Role:
  Kind:  ClusterRole
  Name:  kamus-controller
Subjects:
  Kind            Name              Namespace
  ----            ----              ---------
  ServiceAccount  kamus-controller  default
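Both bindings above grant the role only to service accounts in the default namespace, not in shared. Whether that is what produces the Forbidden response can be checked with kubectl auth can-i (a diagnostic sketch; service account names taken from the output above):

```shell
# 'no' for the shared SA would line up with the Forbidden response
# from ListClusterCustomObject.
kubectl auth can-i watch kamussecrets \
  --as=system:serviceaccount:shared:kamus-controller
kubectl auth can-i watch kamussecrets \
  --as=system:serviceaccount:default:kamus-controller
```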

As a test, I deleted the other kamus instance that was working (the one using AES) and redeployed the Google KMS one - and now it's working correctly.

So, this poses a challenge. I really wanted to be able to use AES for one namespace (development) and GKMS for another namespace (production). This would mean I could easily deploy the dev namespace locally into minikube since I would be able to use the same AES encryption key both locally and in GKE.

Do you see this as something that could be fixed (the ability to run multiple instances), or is the assumption of a single Kamus instance baked into the design of the system?

Yes, we can add support for that - it's basically changing the controller watch action to watch a specific namespace. Not that complicated - a PR would be appreciated :)

Closing as stale, let us know if you want to reopen and contribute.