[FEATURE]Handle Failure on Multi Stack Runs

Question


Closed this issue 2 months ago · 2 comments

Is your feature request related to a problem? Please describe.
When we run multiple stacks, for example when there are changes on all stacks, a failure on one stack fails the whole run and we end up running everything again.
Situation: All our stacks use a module at v1, and a new release of the module, v2, is available. We run the terramate command to run all changed stacks, which in this case means running every stack.
Issue: If any one stack fails, whether from an intermittent error or a network/environment issue, the entire terramate run fails. We then have to run every stack again, even though the other stacks succeeded, and for those stacks the plan reports no changes.

Describe the solution you'd like
Terramate should be smart enough to skip a stack whose run already succeeded instead of re-running it. Is there a way to track the success and make it known to terramate via the stacks.hcl file?

Describe alternatives you've considered
No alternative. It is costing us a lot of time re-running the whole thing again and again.

Additional context

Answer 1 · 2024-10-28T16:57:07.000Z

Hi @ramizraza504, this exists as a feature in Terramate Cloud. E.g. you can rerun or trigger stacks by a specific status in the CLI when authenticated and synced with Terramate Cloud.

For example:

Run `terraform apply` on all failed stacks

terramate run --status failed -- terraform apply

Create a trigger for all failed stacks

terramate experimental trigger --status failed

We also have documentation available for this at https://terramate.io/docs/cli/reference/cmdline/run#running-a-command-on-stacks-with-specific-cloud-status

Since this is an already existing feature in Terramate Cloud, I will close this issue for now. Please feel free to follow up with questions if any occur!

Answer 2 · 2024-10-28T17:13:38.000Z

@soerenmartius Thanks for the quick response to this query. Is this only available via Terramate Cloud? Is this feature not part of the open source CLI without Terramate Cloud?

Also, in my case I have been trying to run, let's say, 50+ stacks based on changes made to a terragrunt.hcl file under each stack. My pipeline runs the terramate run --changed command, which identifies the changed stacks and runs terragrunt plan/apply on all of them. If one fails, how do I handle the failure in this scenario, given that it is based on git commit changes?
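
For the pipeline scenario above, one possible workaround outside Terramate Cloud is to record failed stacks in the pipeline itself and retry only those. The following is a minimal sketch, not a Terramate feature: the function name `run_stacks` and the `FAILED_FILE` location are hypothetical, and it assumes the stack paths it receives are relative to the directory the script runs in.

```shell
#!/bin/sh
# Sketch only: run a command in each stack directory, record the stacks
# that failed, and allow a later retry that covers only those stacks.
# FAILED_FILE and run_stacks are hypothetical names, not part of Terramate.

FAILED_FILE="${FAILED_FILE:-.failed-stacks}"

# run_stacks CMD [ARGS...]
# Reads stack directories from stdin, one per line, and runs CMD in each.
run_stacks() {
    new_failed=$(mktemp)
    while IFS= read -r stack; do
        [ -n "$stack" ] || continue
        # stdin is redirected so the command cannot consume the stack list.
        if (cd "$stack" && "$@") </dev/null; then
            echo "ok: $stack"
        else
            echo "failed: $stack"
            printf '%s\n' "$stack" >>"$new_failed"
        fi
    done
    # Replace the record only after the whole list has been processed.
    mv "$new_failed" "$FAILED_FILE"
    # Succeed only if no stack failed.
    [ ! -s "$FAILED_FILE" ]
}

# Possible usage in a pipeline (assumes terramate and terragrunt are installed):
#   terramate list --changed | run_stacks terragrunt plan
#   run_stacks terragrunt plan <"$FAILED_FILE"    # retry only the failures
```

Because the set of changed stacks stays the same for a given commit, the retry step reads the recorded failures instead of recomputing the changed set, so stacks that already succeeded are not run again.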
