litmuschaos/litmus

Resilience Probe evaluation fails when source is configured

rogeriofbrito opened this issue · 0 comments

What happened: When a cmd probe with a source is configured, the probe evaluation fails.

What you expected to happen: When a cmd probe with a source is evaluated, the command is executed inside the container specified in the source, and the probe passes or fails according to the command output.

Where can this issue be corrected? (optional): probably in Chaos Engine creation.

How to reproduce it (as minimally and precisely as possible):

  1. Create a new resilience probe of type CMD

  2. In the Properties section, set:

    • Timeout: 60s
    • Interval: 3s

    [Image: Screenshot 2024-06-07 at 15 40 36]

  3. In the Probe Details section, set:

    • Command: k6 version
    • Data Comparison
      • Type: String
      • Comparison Criteria: contains
      • Value: 0.51.0
    • Source:
      image: grafana/k6:0.51.0
      imagePullPolicy: Always
      privileged: true
      hostNetwork: false

    [Image: probe_created_chaoscenter]

  4. Run any experiment with this probe attached. In my scenario, I ran a pod-delete experiment.

Expected probe saved in MongoDB:

[Image: probe_saved_in_mongo_db]

Expected result after running the experiment:

[Image: Screenshot 2024-06-07 at 15 44 16]

Expected error in the experiment runner pod logs:

time="2024-06-05T15:22:40Z" level=info msg="Experiment Name: pod-delete"
time="2024-06-05T15:22:40Z" level=info msg="[PreReq]: Getting the ENV for the pod-delete experiment"
time="2024-06-05T15:22:42Z" level=info msg="[PreReq]: Updating the chaos result of pod-delete experiment (SOT)"
time="2024-06-05T15:22:44Z" level=info msg="The application information is as follows" Targets="[{namespace: order-api-app, kind: deployment, labels: [app=order-api-app]}]" Chaos Duration=15
time="2024-06-05T15:22:46Z" level=info msg="[Probe]: The cmd probe information is as follows" Mode=Edge Phase=PreChaos Name=smoke-test-order-api-native Command="k6 version" Comparator="{string contains v0.51.0}" Source="<nil>" Run Properties="{60s 3s 0 0  0    false}"
time="2024-06-05T15:22:46Z" level=info msg="name: smoke-test-order-api-native, err: {\"errorCode\":\"CMD_PROBE_ERROR\",\"reason\":\"unable to run command: \",\"target\":\"{name: smoke-test-order-api-native}\"}"
time="2024-06-05T15:22:46Z" level=error msg="Probe Failed, err: probes failed\n --- at /litmus-go/pkg/probe/probe.go:339 (execute) ---\nCaused by: {\"errorCode\":\"CMD_PROBE_ERROR\",\"reason\":\"unable to run command: \",\"target\":\"{name: smoke-test-order-api-native}\"}"
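
The telling detail in the log above is `Source="<nil>"` on the cmd probe information line. As a rough aid for spotting this in longer runner logs, here is a minimal Python sketch that pulls the `Source` field out of such a line. The function names are hypothetical, and the regular expression only assumes the `key="value"` layout seen in the lines quoted in this issue; it is not derived from the litmus-go source.

```python
import re

# Matches a Source="..." field as it appears in the probe information line
# quoted in this issue. Escaped quotes inside the value are tolerated.
_SOURCE_FIELD = re.compile(r'\bSource="((?:[^"\\]|\\.)*)"')


def probe_source_from_log(line):
    """Return the raw Source value from a log line, or None if the field is absent."""
    match = _SOURCE_FIELD.search(line)
    return match.group(1) if match else None


def source_is_missing(line):
    """True when the line reports a probe whose Source was logged as <nil>."""
    return probe_source_from_log(line) == "<nil>"


# The probe information line from the log in this issue.
line = (
    'time="2024-06-05T15:22:46Z" level=info '
    'msg="[Probe]: The cmd probe information is as follows" '
    'Mode=Edge Phase=PreChaos Name=smoke-test-order-api-native '
    'Command="k6 version" Comparator="{string contains v0.51.0}" '
    'Source="<nil>" Run Properties="{60s 3s 0 0  0    false}"'
)
print(source_is_missing(line))
```

A `<nil>` source on a probe that was configured with a source image suggests the source never reached the experiment, which matches the `unable to run command` error that follows.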

Anything else we need to know?:

  • In this thread there is a discussion with more details about the problem.

  • The problem is probably occurring because the Chaos Engine is created without the probe source information. The image below shows the Chaos Engine created for a pod-delete experiment with a probe configured with a source. Note that the source information is missing.

    [Image: probe_configured_in_chaos_engine_without_source]

  • All tests were executed on Litmus 3.6.0 and Litmus Agent 3.6.0, both installed using Helm charts in a local kind k8s cluster. The Docker environment was launched by rancher-desktop/qemu on a MacBook M2 Pro.
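
To check a generated Chaos Engine for the missing source without reading the manifest by eye, something like the following Python sketch could be used. It assumes the engine has been exported as JSON (for example with `kubectl get chaosengine <name> -o json`) and that probes live under `spec.experiments[].spec.probe[]` with their inputs under the `cmdProbe/inputs` key. The function name and the sample manifest are hypothetical; the manifest only mimics the situation described above and is not copied from a real cluster.

```python
import json


def cmd_probes_without_source(engine):
    """List names of cmdProbe entries in a ChaosEngine dict that carry no source block."""
    missing = []
    for experiment in (engine.get("spec") or {}).get("experiments") or []:
        for probe in (experiment.get("spec") or {}).get("probe") or []:
            if probe.get("type") != "cmdProbe":
                continue  # only cmd probes accept a source
            inputs = probe.get("cmdProbe/inputs") or {}
            if not inputs.get("source"):
                missing.append(probe.get("name"))
    return missing


# Hypothetical manifest shaped like the engine described in this issue:
# the cmd probe is present, but its source block is absent.
ENGINE_JSON = """
{
  "kind": "ChaosEngine",
  "spec": {
    "experiments": [
      {
        "name": "pod-delete",
        "spec": {
          "probe": [
            {
              "name": "smoke-test-order-api-native",
              "type": "cmdProbe",
              "mode": "Edge",
              "cmdProbe/inputs": {
                "command": "k6 version",
                "comparator": {
                  "type": "string",
                  "criteria": "contains",
                  "value": "0.51.0"
                }
              }
            }
          ]
        }
      }
    ]
  }
}
"""

engine = json.loads(ENGINE_JSON)
print(cmd_probes_without_source(engine))
```

A cmd probe without a source is valid on its own, so a name in this list only points to the bug when the probe was created with a source in ChaosCenter, as in the reproduction steps.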