terramate-io/terramate

[BUG] `terramate create --all-terragrunt` error with nested stacks

Closed this issue · 9 comments

Describe the bug

Terramate does not exclude implicit nested stacks from after field generating invalid stack dependency graph.

To Reproduce

Steps to reproduce the behavior:

  1. Have a code base with nested stacks
  2. Run command terramate create --all-terragrunt. Terramate adds parent in after list in stack.tm.hcl
  3. terramate list --run-order throws Invalid stack configuration > cycle detected

Expected behavior
Produce a valid dependency graph exclucing parent stack from Terramate dependency graph since its implicit.

Log Output
Add logs from Terramate to help debug your problem.

Environment

  • OS: MacOS Sonoma
  • OS Version 14.5
  • Git Version 2.45.2
  • Terramate Version 0.9.0

Additional context
I love Terramate, happy to help! Thanks for putting such a great tool out there!

Thanks for reporting this. We will look at this and get back to you asap!

Thanks!

Can't share code to reproduce easily, sorry.
As I am working on a mono repo with 1298 stacks, I can provide serious testing.

Example of invalid deps graph below

Child referencing parent:
Screenshot 2024-06-26 at 17 55 27

Parent referencing child:
image

@i4ki @snakster could you guys take a look at this please?

i4ki commented

Hey @ohmer

can you check if the branch in this PR fixes the issue? I managed to reproduce it in parent-child case but maybe it's not the same case you are getting.

To install the branch, you can do:

go install github.com/terramate-io/terramate/cmd/terramate@i4k-fix-tg-after-cycle

You can also export GOBIN=<place> to install it in a separate directory.

Let me know if it works.

Let me know if it works

Awesome, will do ASAP!

Looks better, but not a fix I believe. How I tested:

GOBIN=~/bin/ go install github.com/terramate-io/terramate/cmd/terramate@i4k-fix-tg-after-cycle
time ~/bin/terramate create --all-terragrunt
...
Created stack /support/global/default
Created stack /support/us-east-1/default
Error: failed to initialize Terragrunt modules
> validating stack fields: field after has duplicate "/prod/eu-west-3/data-biron/aurora" element

________________________________________________________
Executed in  509.13 secs    fish           external
   usr time  668.92 secs  222.00 micros  668.92 secs
   sys time  110.65 secs  881.00 micros  110.65 secs

> cycle detected: /preprod/eu-west-3/data -> /preprod/eu-west-3/data/kms -> /preprod/eu-west-3/data: checking node id "/preprod/eu-west-3/data"

FYI

find . -type f -name terragrunt.hcl -prune -not -path '*/.terragrunt-cache/*' | wc -l
    1202

On the cyclic dependency:

❯ grep -A1 'dependency "'  preprod/eu-west-3/data/terragrunt.hcl
dependency "web" {
  config_path = "../web"
--
dependency "ecom_private" {
  config_path = "../ecom-private"
--
dependency "erp" {
  config_path = "../erp"
--
dependency "support_resources" {
  config_path = "../../../support/eu-west-3/default/resources"
--
dependency "kong" {
  config_path = "../kong"
--
dependency "control" {
  config_path = "../../../support/eu-west-3/control/sns"
--
dependency "kms" {
  config_path = "./kms"

❯ cat preprod/eu-west-3/data/stack.tm.hcl
stack {
  name        = "data"
  description = "data"
  after       = ["/preprod/eu-west-3/data/kms", "/preprod/eu-west-3/ecom-private", "/preprod/eu-west-3/erp", "/preprod/eu-west-3/kong", "/preprod/eu-west-3/web", "/support/eu-west-3/control/sns", "/support/eu-west-3/default/resources"]
  id          = "ea79feb0-4c94-412d-af50-489fea414d60"
}

Since /preprod/eu-west-3/data/kms is nested under /preprod/eu-west-3/data, preprod/eu-west-3/data/stack.tm.hcl should not add any of its children to the after list as it contradicts the implicit before in the parent, correct?

i4ki commented

Hi @ohmer,

Sorry for the long delay to have this fixed.
I'm focused on fixing a few Terragrunt issues and I'm working on this one right now.

So the original reported problem is solved in the linked PR but the second one is actually not possible in Terramate.

Since /preprod/eu-west-3/data/kms is nested under /preprod/eu-west-3/data, preprod/eu-west-3/data/stack.tm.hcl should not add any of its children to the after list as it contradicts the implicit before in the parent, correct?

yes, Terramate has the filesystem ordering enabled by default and then you cannot have a stack that depends on stacks defined in its child directories.

So for the moment, we are gonna throw an error if this is seen during the create --all-terragrunt but we are discussing the introduction of a configuration for disabling the filesystem ordering or maybe configure its behavior.

ohmer commented

Hi @i4ki

Sorry for the long delay to have this fixed. I'm focused on fixing a few Terragrunt issues and I'm working on this one right now.

No worries at all, I understand the complexity of the matter and there is no pressuring issue on my end. I want to demonstrate the value of Terramate CLI vs Terragrunt but this is not going to be a switch that will be made anytime soon anyway.

So for the moment, we are gonna throw an error if this is seen during the create --all-terragrunt but we are discussing the introduction of a configuration for disabling the filesystem ordering or maybe configure its behavior.

An error feels like the right call to me. I think the most important outcome here is to ease adoption of Terramate. You pointed out an incompability between the tools in a certain use case. I would find it perfectly acceptable to keep it as an error long term as disabling the filesystem ordering feels quite intrusive. I'd rather keep Terramate simple and lean rather than bending it to Terragrunt's will. An error would indicate that the Terragrunt code base needs to be refactored to be convertible to Terramate. Sounds good enough to me! Some documentation update would help adopters.