[Question]: I have two questions regarding its usage. May I ask if they can be resolved?
Checklist
- I've searched for similar issues and couldn't find anything matching
- I've included steps to reproduce the behavior
Affected Components
- K8sGPT (CLI)
- K8sGPT Operator
K8sGPT Version
v0.1.6
Kubernetes Version
v1.28.11
Host OS and its Version
Rocky Linux 8.10
Steps to reproduce
1. While the pod stays in an error state, the results list is intermittently empty.
# k get results -n monitoring
NAME                                     KIND   BACKEND
defaultnginxdeployment26b7b6f9774b4wng   Pod    openai
# k get pod
NAME                                 READY   STATUS             RESTARTS   AGE
nginx-deployment2-6b7b6f9774-b4wng   0/1     ImagePullBackOff   0          2m21s
# k get results -n monitoring
No resources found in monitoring namespace.
# k get results -n monitoring
NAME                                     KIND   BACKEND
defaultnginxdeployment26b7b6f9774b4wng   Pod    openai
2. The K8sGPT operator prints the error logs below, although they do not seem to affect usage.
Created result defaultnginxdeployment56f9d4488hx589
Finished Reconciling k8sGPT
Creating new client for 10.108.197.164:8080
Connection established between 10.108.197.164:8080 and localhost with time out of 1 seconds.
Remote Address : 10.108.197.164:8080
K8sGPT address: 10.108.197.164:8080
Checking if defaultnginxdeployment56f9d4488hx589 is still relevant
Finished Reconciling k8sGPT with error: Operation cannot be fulfilled on results.core.k8sgpt.ai "defaultnginxdeployment56f9d4488hx589": the object has been modified; please apply your changes to the latest version and try again
2024-07-16T19:49:06Z ERROR Reconciler error {"controller": "k8sgpt", "controllerGroup": "core.k8sgpt.ai", "controllerKind": "K8sGPT", "K8sGPT": {"name":"k8sgpt-sample","namespace":"monitoring"}, "namespace": "monitoring", "name": "k8sgpt-sample", "reconcileID": "2b6fe54e-750e-4731-a364-d689a2665448", "error": "Operation cannot be fulfilled on results.core.k8sgpt.ai \"defaultnginxdeployment56f9d4488hx589\": the object has been modified; please apply your changes to the latest version and try again"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:324
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226
Creating new client for 10.108.197.164:8080
Connection established between 10.108.197.164:8080 and localhost with time out of 1 seconds.
Remote Address : 10.108.197.164:8080
K8sGPT address: 10.108.197.164:8080
Checking if defaultnginxdeployment56f9d4488hx589 is still relevant
Finished Reconciling k8sGPT
Expected behaviour
- Can the results be displayed stably for as long as the errors persist?
- How can I eliminate these error logs from the K8sGPT operator?
Actual behaviour
No response
Additional Information
Configuration
apiVersion: core.k8sgpt.ai/v1alpha1
kind: K8sGPT
metadata:
  name: k8sgpt-sample
  namespace: monitoring
spec:
  ai:
    enabled: true
    model: gpt-3.5-turbo
    backend: openai
    baseUrl: https://api.chatanywhere.tech
    secret:
      name: k8sgpt-sample-secret
      key: openai-api-key
    language: chinese
  noCache: false
  repository: ghcr.io/k8sgpt-ai/k8sgpt
  version: v0.3.8
Hey @yangy30, thanks for raising this.
I will also try to reproduce this, as it seems we could handle the lifecycle of the result object better.
From the logs, it looks like the operator is updating the result object using a stale revision (an out-of-date resourceVersion), so the API server rejects the write, and the operation is then retried successfully in the next reconciliation loop.
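To illustrate the conflict described above, here is a minimal Python sketch of optimistic concurrency on a resource version. It is not the operator's code: `FakeStore`, `ConflictError`, `update_with_retry`, and the object name `example-result` are all hypothetical stand-ins, and the store only imitates how an API server rejects a write based on an out-of-date snapshot.

```python
import copy


class ConflictError(Exception):
    """Raised when an update carries a stale resourceVersion."""


class FakeStore:
    """Toy in-memory stand-in for the API server (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def create(self, name, spec):
        self._objects[name] = {"name": name, "resourceVersion": 1, "spec": dict(spec)}

    def get(self, name):
        # Return a copy so the caller holds a snapshot, like a real client.
        return copy.deepcopy(self._objects[name])

    def update(self, obj):
        current = self._objects[obj["name"]]
        # Reject writes that were computed from an older snapshot.
        if obj["resourceVersion"] != current["resourceVersion"]:
            raise ConflictError("the object has been modified")
        stored = copy.deepcopy(obj)
        stored["resourceVersion"] = current["resourceVersion"] + 1
        self._objects[obj["name"]] = stored
        return copy.deepcopy(stored)


def update_with_retry(store, name, mutate, attempts=3):
    """Re-read the latest object and reapply `mutate` after each conflict."""
    for _ in range(attempts):
        obj = store.get(name)
        mutate(obj)
        try:
            return store.update(obj)
        except ConflictError:
            continue
    raise ConflictError(f"gave up after {attempts} attempts")


store = FakeStore()
store.create("example-result", {"details": ""})

stale = store.get("example-result")      # snapshot held by one reconcile pass
fresh = store.get("example-result")
fresh["spec"]["details"] = "first writer"
store.update(fresh)                      # another writer bumps the version

stale["spec"]["details"] = "second writer"
try:
    store.update(stale)                  # rejected: the snapshot is out of date
    conflict_seen = False
except ConflictError:
    conflict_seen = True

# Fetching the latest version before writing lets the update go through.
final = update_with_retry(
    store, "example-result", lambda o: o["spec"].update(details="second writer")
)
```

The write from the stale snapshot fails in the same way as the "object has been modified" error in the logs, while the read-modify-write loop succeeds because it always starts from the latest version. Go controllers often get the same effect with client-go's `retry.RetryOnConflict`; whether that fits the operator here is an open question, not something this thread confirms.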
I am still unsure how the result spec can end up empty when the operation is not successful, though.
Do you see any issues in the k8sgpt pod logs? The k8sgpt pod makes the inference call to your AI backend; if that call fails, it might get an empty response back and write it to the results object.