Basic container-kill is not working with YAML created through UI in 3.4.0
What happened:
A basic container-kill fault fails when run with the YAML generated by the 3.4.0 UI. When I correct the YAML by hand, the UI will not let me save it.
Logs:
{"mainLogs":"W0304 02:03:16.394831 1 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.\n2024/03/04 02:03:16 Error Creating Resource : ChaosEngine.litmuschaos.io 'container-kill-mcilxqr5' is invalid: [spec.experiments[0].spec.probe[0].runProperties.interval: Invalid value: 'string': spec.experiments[0].spec.probe[0].runProperties.interval in body must be of type integer: 'string', spec.experiments[0].spec.probe[0].runProperties.probePollingInterval: Invalid value: 'string': spec.experiments[0].spec.probe[0].runProperties.probePollingInterval in body must be of type integer: 'string', spec.experiments[0].spec.probe[0].runProperties.probeTimeout: Invalid value: 'string': spec.experiments[0].spec.probe[0].runProperties.probeTimeout in body must be of type integer: 'string']\n"}
What you expected to happen:
The target container should be killed.
Where can this issue be corrected? (optional)
UI or backend or both.
How to reproduce it (as minimally and precisely as possible):
Workflow generated from the UI, using a cmd probe:
kind: Workflow
apiVersion: argoproj.io/v1alpha1
metadata:
  name: test-kill
  namespace: application
  labels:
    infra_id: b15c80b3-311a-4e1e-b071-42201aa4f765
    revision_id: 365a2fac-b4c7-48e9-ae22-b5af0701d678
    workflow_id: 7db33825-6073-478d-b109-7edb45551cdb
    workflows.argoproj.io/controller-instanceid: b15c80b3-311a-4e1e-b071-42201aa4f765
spec:
  templates:
    - name: test-kill
      inputs: {}
      outputs: {}
      metadata: {}
      steps:
        - - name: install-chaos-faults
            template: install-chaos-faults
            arguments: {}
        - - name: container-kill-mci
            template: container-kill-mci
            arguments: {}
        - - name: cleanup-chaos-resources
            template: cleanup-chaos-resources
            arguments: {}
    - name: install-chaos-faults
      inputs:
        artifacts:
          - name: container-kill-mci
            path: /tmp/container-kill-mci.yaml
            raw:
              data: >
                apiVersion: litmuschaos.io/v1alpha1
                description:
                  message: |
                    Kills a container belonging to an application pod
                kind: ChaosExperiment
                metadata:
                  name: container-kill
                  labels:
                    name: container-kill
                    app.kubernetes.io/part-of: litmus
                    app.kubernetes.io/component: chaosexperiment
                    app.kubernetes.io/version: ci
                spec:
                  definition:
                    scope: Namespaced
                    permissions:
                      - apiGroups:
                          - ""
                        resources:
                          - pods
                        verbs:
                          - create
                          - delete
                          - get
                          - list
                          - patch
                          - update
                          - deletecollection
                      - apiGroups:
                          - ""
                        resources:
                          - events
                        verbs:
                          - create
                          - get
                          - list
                          - patch
                          - update
                      - apiGroups:
                          - ""
                        resources:
                          - configmaps
                        verbs:
                          - get
                          - list
                      - apiGroups:
                          - ""
                        resources:
                          - pods/log
                        verbs:
                          - get
                          - list
                          - watch
                      - apiGroups:
                          - ""
                        resources:
                          - pods/exec
                        verbs:
                          - get
                          - list
                          - create
                      - apiGroups:
                          - apps
                        resources:
                          - deployments
                          - statefulsets
                          - replicasets
                          - daemonsets
                        verbs:
                          - list
                          - get
                      - apiGroups:
                          - apps.openshift.io
                        resources:
                          - deploymentconfigs
                        verbs:
                          - list
                          - get
                      - apiGroups:
                          - ""
                        resources:
                          - replicationcontrollers
                        verbs:
                          - get
                          - list
                      - apiGroups:
                          - argoproj.io
                        resources:
                          - rollouts
                        verbs:
                          - list
                          - get
                      - apiGroups:
                          - batch
                        resources:
                          - jobs
                        verbs:
                          - create
                          - list
                          - get
                          - delete
                          - deletecollection
                      - apiGroups:
                          - litmuschaos.io
                        resources:
                          - chaosengines
                          - chaosexperiments
                          - chaosresults
                        verbs:
                          - create
                          - list
                          - get
                          - patch
                          - update
                          - delete
                    image: xxx/litmuschaos/go-runner:latest
                    imagePullPolicy: Always
                    args:
                      - -c
                      - ./experiments -name container-kill
                    command:
                      - /bin/bash
                    env:
                      - name: TARGET_CONTAINER
                        value: ""
                      - name: RAMP_TIME
                        value: ""
                      - name: TARGET_PODS
                        value: ""
                      - name: CHAOS_INTERVAL
                        value: "10"
                      - name: SIGNAL
                        value: SIGKILL
                      - name: SOCKET_PATH
                        value: /run/containerd/containerd.sock
                      - name: CONTAINER_RUNTIME
                        value: containerd
                      - name: TOTAL_CHAOS_DURATION
                        value: "20"
                      - name: PODS_AFFECTED_PERC
                        value: ""
                      - name: NODE_LABEL
                        value: ""
                      - name: DEFAULT_HEALTH_CHECK
                        value: "false"
                      - name: LIB_IMAGE
                        value: xxx/litmuschaos/go-runner:latest
                      - name: SEQUENCE
                        value: parallel
                    labels:
                      name: container-kill
                      app.kubernetes.io/part-of: litmus
                      app.kubernetes.io/component: experiment-job
                      app.kubernetes.io/runtime-api-usage: "true"
                      app.kubernetes.io/version: ci
      outputs: {}
      metadata: {}
      container:
        name: ""
        image: xxx/litmuschaos/k8s:2.11.0
        command:
          - sh
          - -c
        args:
          - kubectl apply -f /tmp/ -n {{workflow.parameters.adminModeNamespace}}
            && sleep 30
        resources: {}
    - name: cleanup-chaos-resources
      inputs: {}
      outputs: {}
      metadata: {}
      container:
        name: ""
        image: xxx/litmuschaos/k8s:2.11.0
        command:
          - sh
          - -c
        args:
          - kubectl delete chaosengine -l workflow_run_id={{workflow.uid}} -n
            {{workflow.parameters.adminModeNamespace}}
        resources: {}
    - name: container-kill-mci
      inputs:
        artifacts:
          - name: container-kill-mci
            path: /tmp/chaosengine-container-kill-mci.yaml
            raw:
              data: >
                apiVersion: litmuschaos.io/v1alpha1
                kind: ChaosEngine
                metadata:
                  namespace: "{{workflow.parameters.adminModeNamespace}}"
                  labels:
                    workflow_run_id: "{{ workflow.uid }}"
                    workflow_name: test-kill
                  annotations:
                    probeRef: '[{"name":"cmd","mode":"OnChaos"}]'
                  generateName: container-kill-mci
                spec:
                  engineState: active
                  appinfo:
                    appns: application
                    applabel: app.kubernetes.io/managed-by=solace-pubsubplus-operator
                    appkind: statefulset
                  chaosServiceAccount: litmus-admin
                  experiments:
                    - name: container-kill
                      spec:
                        components:
                          env:
                            - name: TARGET_CONTAINER
                              value: ""
                            - name: RAMP_TIME
                              value: ""
                            - name: TARGET_PODS
                              value: ""
                            - name: CHAOS_INTERVAL
                              value: "10"
                            - name: SIGNAL
                              value: SIGKILL
                            - name: SOCKET_PATH
                              value: /run/containerd/containerd.sock
                            - name: CONTAINER_RUNTIME
                              value: containerd
                            - name: TOTAL_CHAOS_DURATION
                              value: "20"
                            - name: PODS_AFFECTED_PERC
                              value: ""
                            - name: NODE_LABEL
                              value: ""
                            - name: DEFAULT_HEALTH_CHECK
                              value: "false"
                            - name: LIB_IMAGE
                              value: xxx/litmuschaos/go-runner:latest
                            - name: SEQUENCE
                              value: parallel
      outputs: {}
      metadata:
        labels:
          weight: "10"
      container:
        name: ""
        image: xxx/litmuschaos/litmus-checker:2.11.0
        args:
          - -file=/tmp/chaosengine-container-kill-mci.yaml
          - -saveName=/tmp/engine-name
        resources: {}
  entrypoint: test-kill
  arguments:
    parameters:
      - name: adminModeNamespace
        value: application
  serviceAccountName: argo-chaos
  podGC:
    strategy: OnWorkflowCompletion
  securityContext:
    runAsUser: 1000
    runAsNonRoot: true
status: {}
Anything else we need to know?:
This appears to be caused by a misleading instruction in the 3.4.0 UI. When enabling chaos, it asked me to apply: https://raw.githubusercontent.com/litmuschaos/litmus/master/mkdocs/docs/3.0.0-beta10/litmus-portal-crds-3.0.0-beta10.yml
That is an old CRD. The UI should instead ask to apply something like: https://github.com/litmuschaos/litmus/blob/f86aad1328bcc580c97cc1ab57ee217880e7fb08/mkdocs/docs/3.4.0/litmus-portal-crds-3.4.0.yml
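For context, the validation errors in the logs fit a CRD version mismatch: the installed CRD declares `interval`, `probePollingInterval`, and `probeTimeout` as integers, while the rejected ChaosEngine carries them as strings. The probe definition itself is not included in this report, so the fragment below is only a sketch of what the rejected section presumably looks like. The probe name and mode come from the `probeRef` annotation; the command, comparator, and all duration values are illustrative assumptions.

```yaml
# Hypothetical probe section of the generated ChaosEngine (illustrative values only).
# An older CRD that types these duration fields as integers would reject the strings.
probe:
  - name: cmd                    # from the probeRef annotation
    type: cmdProbe
    mode: OnChaos                # from the probeRef annotation
    cmdProbe/inputs:
      command: "echo ok"         # assumed command, not taken from the report
      comparator:
        type: string
        criteria: contains
        value: "ok"
    runProperties:
      probeTimeout: 5s           # string duration; flagged as "must be of type integer"
      interval: 2s               # string duration; flagged as "must be of type integer"
      probePollingInterval: 1s   # string duration; flagged as "must be of type integer"
      retry: 1
```

If this is indeed the shape being generated, applying a CRD whose schema matches the control plane version should let the engine pass validation without editing the workflow YAML.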
@Saranya-jena @hrishavjha please take a look at this