k8sgpt-ai/k8sgpt-operator

[Question]: k8sGPT Operator with Gemini Backend not working as expected

ibakshay opened this issue · 0 comments

Checklist

  • I've searched for similar issues and couldn't find anything matching
  • I've included steps to reproduce the behavior

Affected Components

  • K8sGPT (CLI)
  • K8sGPT Operator

K8sGPT Version

v0.3.41

Kubernetes Version

v1.30.0

Host OS and its Version

No response

Steps to reproduce

  1. Have a Kubernetes cluster running the k8sGPT Operator.
  2. Apply a K8sGPT manifest with google as the backend (shown here as kubectl describe output):
    Name:         k8sgpt-sample
    Namespace:    k8sgpt-operator-system
    Labels:       <none>
    Annotations:  <none>
    API Version:  core.k8sgpt.ai/v1alpha1
    Kind:         K8sGPT
    Metadata:
      Creation Timestamp:  2024-10-12T23:08:41Z
      Finalizers:
        k8sgpt.ai/finalizer
      Generation:        3
      Resource Version:  20685
      UID:               164d7e80-c6df-4e09-8836-c1deabc850af
    Spec:
      Ai:
        Anonymized:  true
        Back Off:
          Enabled:      true
          Max Retries:  5
        Backend:        google
        Enabled:        true
        Language:       english
        Model:          gemini-pro
        Secret:
          Key:     google-api-key
          Name:    k8sgpt-sample-secret
      Repository:  ghcr.io/k8sgpt-ai/k8sgpt
      Version:     v0.3.41
    Events:        <none>
  3. Create a Kubernetes secret with your Google API key:
    kubectl create secret generic k8sgpt-sample-secret --from-literal=google-api-key=<Google API Key> -n k8sgpt-operator-system

Expected behaviour

The spec.details field in the generated Result manifest should contain information about the error, not an empty string. Example:

apiVersion: core.k8sgpt.ai/v1alpha1
kind: Result
metadata:
  creationTimestamp: "2024-10-13T19:34:35Z"
  generation: 1
  labels:
    k8sgpts.k8sgpt.ai/backend: google
    k8sgpts.k8sgpt.ai/name: k8sgpt-sample
    k8sgpts.k8sgpt.ai/namespace: k8sgpt-operator-system
  name: defaultnginxdeployment667cff5d68j7vtv
  namespace: k8sgpt-operator-system
  resourceVersion: "1980"
  uid: aad78f92-2473-4521-8169-fcd04279e2db
spec:
  backend: google
  details: "<Error details should be here>"
  error:
  - text: the last termination reason is Error container=nginx pod=nginx-deployment-667cff5d68-j7vtv
  kind: Pod
  name: default/nginx-deployment-667cff5d68-j7vtv
  parentObject: ""
status: {}

Actual behaviour

  • The k8sgpt-sample deployment and its pod are created successfully, and k8sGPT is running without issues.
  • The Result manifests are also created, but the spec.details field is empty. Here’s an example manifest:
apiVersion: core.k8sgpt.ai/v1alpha1
kind: Result
metadata:
  creationTimestamp: "2024-10-13T19:34:35Z"
  generation: 1
  labels:
    k8sgpts.k8sgpt.ai/backend: google
    k8sgpts.k8sgpt.ai/name: k8sgpt-sample
    k8sgpts.k8sgpt.ai/namespace: k8sgpt-operator-system
  name: defaultnginxdeployment667cff5d68j7vtv
  namespace: k8sgpt-operator-system
  resourceVersion: "1980"
  uid: aad78f92-2473-4521-8169-fcd04279e2db
spec:
  backend: google
  details: ""
  error:
  - text: the last termination reason is Error container=nginx pod=nginx-deployment-667cff5d68-j7vtv
  kind: Pod
  name: default/nginx-deployment-667cff5d68-j7vtv
  parentObject: ""
status: {}

Additional Information

I checked the logs of the k8sgpt-sample pod and found the following error:

Request failed. Failed while calling AI provider Google: googleapi: Error 400: * GenerateContentRequest.generation_config.max_output_tokens: max_output_tokens must be positive.

k8sgpt-sample-7c58bdb9d6-b5j2n {"level":"info","ts":1728763974.447422,"caller":"server/server.go:146","msg":"binding metrics to 8081"}
k8sgpt-sample-7c58bdb9d6-b5j2n {"level":"info","ts":1728763974.4475062,"caller":"server/server.go:105","msg":"binding api to 8080"}
k8sgpt-sample-7c58bdb9d6-b5j2n {"level":"info","ts":1728763986.0204685,"caller":"server/log.go:50","msg":"request failed. failed while calling AI provider google: googleapi: Error 400: * GenerateContentRequest.generation_config.max_output_tokens: max_output_tokens must be positive.","duration_ms":4564,"method":"/schema.v1.ServerAnalyzerService/Analyze","request":"backend:\"google\" explain:true anonymize:true language:\"english\" max_concurrency:10 output:\"json\"","remote_addr":"10.244.0.27:58998","status_code":2}

Upon investigation, I discovered that maxOutputTokens is not set anywhere in the K8sGPT manifest, so c.maxTokens in k8sGPT defaults to 0. That value is sent as max_output_tokens, which the Google API rejects with the error above.
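
To illustrate the failure mode, here is a minimal Go sketch of how an unset integer field ends up as 0 and fails a "must be positive" check. The googleClient type and validateMaxOutputTokens function are hypothetical stand-ins, not the actual k8sGPT code; only the maxTokens field name and the error text come from this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// googleClient is a hypothetical stand-in for the backend client.
// Only the maxTokens field name is taken from the issue text.
type googleClient struct {
	maxTokens int
}

// validateMaxOutputTokens mimics the server-side check reported in the
// logs (assumption: the API rejects any non-positive value).
func validateMaxOutputTokens(n int) error {
	if n <= 0 {
		return errors.New("max_output_tokens must be positive")
	}
	return nil
}

func main() {
	// maxTokens is never assigned, so it holds Go's zero value for int.
	var c googleClient
	fmt.Println(c.maxTokens)                          // prints 0
	fmt.Println(validateMaxOutputTokens(c.maxTokens)) // prints the error text
}
```

Any positive value passes the check, which is why supplying a default on the operator side should be enough to avoid the 400 response.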

Proposal:
Add an optional spec.ai.maxOutputToken field with a default value on kind: K8sGPT, which will then be passed to the schemav1.AnalyzeRequest.
If this proposal is acceptable, I would be happy to submit a PR to implement this change.
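
A rough sketch of what the proposal could look like in Go, to make the discussion concrete. Everything here is an assumption for illustration: the trimmed-down AISpec type, the default of 2048, the effectiveMaxTokens and buildRequest helpers, and the MaxTokens field on the stand-in AnalyzeRequest (the real schemav1.AnalyzeRequest may expose this differently). The +kubebuilder:default marker is how controller-gen lets a CRD field carry a default.

```go
package main

import "fmt"

// defaultMaxTokens is an assumed default; the issue does not name a value.
const defaultMaxTokens int32 = 2048

// AISpec is a hypothetical, trimmed-down version of spec.ai.
type AISpec struct {
	Backend string `json:"backend"`
	Model   string `json:"model,omitempty"`
	// MaxOutputToken is the proposed optional field. A pointer is used so
	// that "unset" can be told apart from an explicit value.
	// +kubebuilder:default:=2048
	MaxOutputToken *int32 `json:"maxOutputToken,omitempty"`
}

// AnalyzeRequest is a stand-in for schemav1.AnalyzeRequest; the MaxTokens
// field is an assumption made for this sketch.
type AnalyzeRequest struct {
	Backend   string
	MaxTokens int32
}

// effectiveMaxTokens falls back to the default when the field is unset
// or not positive, so the backend never receives 0.
func effectiveMaxTokens(spec AISpec) int32 {
	if spec.MaxOutputToken == nil || *spec.MaxOutputToken <= 0 {
		return defaultMaxTokens
	}
	return *spec.MaxOutputToken
}

// buildRequest copies the resolved value into the request.
func buildRequest(spec AISpec) AnalyzeRequest {
	return AnalyzeRequest{Backend: spec.Backend, MaxTokens: effectiveMaxTokens(spec)}
}

func main() {
	unset := AISpec{Backend: "google", Model: "gemini-pro"}
	fmt.Println(buildRequest(unset).MaxTokens) // prints 2048

	limit := int32(512)
	custom := AISpec{Backend: "google", MaxOutputToken: &limit}
	fmt.Println(buildRequest(custom).MaxTokens) // prints 512
}
```

Defaulting in both the CRD schema and the Go helper is deliberately redundant: the marker covers newly created resources, while the helper covers objects that were stored before the field existed.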