litmuschaos/litmus

Failed to get the instance tag at "EC2 Stop By Tag"

Closed this issue · 4 comments

What happened:

Fault Summary:
TARGET_SELECTION_ERROR
{"errorCode":"TARGET_SELECTION_ERROR","phase":"PreChaos","reason":"failed to get the instance tag, invalid instance tag","target":"{EC2 Instance Tag: , Region: ap-northeast-2}"}

While running an AWS experiment with the Litmus Helm chart v3.9.0, the ec2-stop-by-tag fault failed. Although I provided a value for the EC2_INSTANCE_TAG field, the experiment received an empty string for it.

cc. @namkyu1999

What you expected to happen:
I expected the ec2-stop-by-tag experiment to stop the EC2 instance based on the provided tag value.

Where can this issue be corrected? (optional)

How to reproduce it (as minimally and precisely as possible):

  1. Install Litmus using Helm with the following command:
helm install chaos litmuschaos/litmus --namespace=litmus --create-namespace --set portal.frontend.service.type=NodePort --set mongodb.image.registry=ghcr.io/zcube --set mongodb.image.repository=bitnami-compat/mongodb --set mongodb.image.tag=6.0.5
  2. Execute the ec2-stop-by-tag fault with a valid EC2_INSTANCE_TAG value (e.g., stack:test).
  3. Observe that the experiment receives an empty string for the EC2_INSTANCE_TAG field and fails.

Anything else we need to know?:

  • I have verified that the ec2-stop-by-id is working as expected, indicating that the issue is specific to the ec2-stop-by-tag.
  • I have tried modifying the manifest to enclose the tag value in double quotes (e.g., "stack:test"), but the issue persists.

Test Environment:

  • minikube v1.33.0
  • litmus helm v3.9.0

The fault configuration has the EC2_INSTANCE_TAG set correctly. However, during the experiment, EC2_INSTANCE_TAG is not passed.

The other experiment that uses tags, ebs-loss-by-tag, works fine.

The error occurs because the GetInstanceList method is passed an empty string as the instanceTag parameter.

https://github.com/litmuschaos/litmus-go/blob/master/pkg/cloud/aws/ec2/ec2-operations.go#L141-L144
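To illustrate the failure mode, here is a minimal Go sketch of a guard that rejects an empty tag before any AWS call is made. `parseInstanceTag` is a hypothetical helper, not the litmus-go code at the link above. It assumes the tag uses the key:value form from the reproduction steps (stack:test) and reuses only the error wording from the fault summary.

```go
package main

import (
	"fmt"
	"strings"
)

// parseInstanceTag is a hypothetical helper (not part of litmus-go).
// It splits a "key:value" tag and fails early when the tag is empty,
// which is the condition reported in the fault summary.
func parseInstanceTag(instanceTag, region string) (string, string, error) {
	if strings.TrimSpace(instanceTag) == "" {
		return "", "", fmt.Errorf(
			"failed to get the instance tag, invalid instance tag {EC2 Instance Tag: %s, Region: %s}",
			instanceTag, region)
	}

	// Assumption: tags are written as key:value, as in "stack:test".
	key, value, found := strings.Cut(instanceTag, ":")
	if !found || key == "" || value == "" {
		return "", "", fmt.Errorf("instance tag %q is not in key:value form", instanceTag)
	}
	return key, value, nil
}
```

With this sketch, `parseInstanceTag("stack:test", "ap-northeast-2")` returns the key `stack` and the value `test`, while an empty tag returns an error before any instance lookup would happen.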

This error occurs because the environment variable was renamed, so the runner can no longer retrieve the value. Rather than changing the fault configuration, I think it would be better to update the codebase to read EC2_INSTANCE_TAG, as a normal fault configuration does.
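A minimal Go sketch of how such a name mismatch produces an empty value. `resolveTag`, `tagFromProcessEnv`, and the name `RENAMED_TAG_VARIABLE` are hypothetical and chosen only for illustration; the variable name the runner actually reads is not stated in this thread.

```go
package main

import "os"

const (
	// configuredName is the name set in the fault configuration.
	configuredName = "EC2_INSTANCE_TAG"
	// hypotheticalLookupName is a placeholder for whatever other name
	// the code reads; it is an assumption, not the real identifier.
	hypotheticalLookupName = "RENAMED_TAG_VARIABLE"
)

// resolveTag returns the value stored under name, or "" when the
// variable is not set. The lookup function is injected so the
// behaviour can be checked without touching the process environment.
func resolveTag(lookup func(string) (string, bool), name string) string {
	if v, ok := lookup(name); ok {
		return v
	}
	return ""
}

// tagFromProcessEnv reads the tag from the real process environment.
func tagFromProcessEnv(name string) string {
	return resolveTag(os.LookupEnv, name)
}
```

If the configuration sets `EC2_INSTANCE_TAG=stack:test` but the code calls `tagFromProcessEnv(hypotheticalLookupName)`, the result is an empty string; reading `configuredName` instead returns `stack:test`.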

I'm gonna work on this issue.