hashicorp/terraform

Cannot use interpolations in lifecycle attributes

clstokes opened this issue ยท 57 comments

I was wanting to parameterize a boolean resource property with a variable, but it seems this isn't possible. I tried this with enable_dns_hostnames on aws_vpc too, so it doesn't seem limited to the prevent_destroy example below.

The example below fails if the variable's default value is unquoted or quoted.

Tested with v0.6.0 and v0.6.3.

variable "prevent_destroy" {
  default = true
}

resource "null_resource" "main" {
  lifecycle {
    prevent_destroy = "${var.prevent_destroy}"
  }
}
$ terraform apply -input=false
Error loading config: Error loading /Users/clstokes/Desktop/main.tf: Error parsing lifecycle for null_resource[main]: 1 error(s) decoding:

* cannot parse 'prevent_destroy' as bool: strconv.ParseBool: parsing "${var.prevent_destroy}": invalid syntax
$

Thanks for the report, @clstokes. The real issue here is that lifecycle blocks cannot contain interpolated values. This is because lifecycle modifications can change the shape of the graph, which makes handling computed values for them extra tricky. It's something we can theoretically do, but requires some thought and effort. Tagging as an enhancement.

Just wanted to chime in and share my use case for this feature.

I'm trying to set up a "staging" and "production" environments that share most of the infrastructure template via Terraform modules. Staging environment is unstable and I'd like to set prevent_destroy = false for all (or most) of its resources, while production environment should have prevent_destroy = true for (almost) everything.

I guess there's currently no way to express this idea in Terraform, which is unfortunate :(

Still, thanks for your awesome work!

Perhaps at the very least we could generalize the special-cased handling of variables within count to apply also to lifecycle blocks. That way variables could be used in here (as shown in the above example) but you wouldn't be able to interpolate outputs from resources and modules since they aren't know until apply time, as @phinze described.

Well, yup, this becomes quite painful when you try to utilize "prevent destroy" functionality in a shared module, that is just not supposed to be modified. I have a use-case similar to @laggyluke 's
+1

Alternatively, supporting prevent_destroy at the module level would be useful and might be easier than supporting interpolation?

+1

+1

+1

Yet another terraform behavior that makes no sense in production.

This is preventing me from implementing a pretty obvious deployment pattern: create new resources without destroying the previous ones (based on prevent_destroy) and then once the load balancer(s) have been reconfigured for the new systems, change prevent_destroy to false and "blow away" the older boxes.

This is a feature which should ideally come ASAP.

Are you familiar with create_before_destroy?
On May 23, 2016 6:11 PM, "Michael Crilly" notifications@github.com wrote:

This is preventing me from implementing a pretty obvious deployment
pattern: create new resources without destroying the previous ones (based
on prevent_destroy) and then once the load balancer(s) have been
reconfigured for the new systems, change prevent_destroy to false and
"blow away" the older boxes.

This is a feature which should ideally come ASAP.

โ€”
You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#3116 (comment)

Hey @nathanielks!

I am indeed aware of that flag, however that's not going to cut it for most environments and CI/CD implementations.

The idea for me is being able to bring up new systems based on a new AMI (for example) being built, having the LB re-adjusted to use those new boxes, and then blowing away those older instances (using Terraform), isn't easily do-able whilst we can't control prevent_destroy from the CLI, via a variable.

Doing it all manually isn't an issue of course, but that's not the CI/CD solution we're looking for :-)

+1
Also looking to implement the use case @laggyluke provided

RobWC commented

+1

I like Terraform, but I feel like it fights me all the time and it makes me sad when things like this happen.

I love Terraform, and use it for all infrastructure deployments in AWS. I understand, you want to keep Terraform "simple to use", but we need to create very complex scripts to be able to reuse modules and avoid duplication. This actually defeats the purpose, because the "simple language" causes more complexity in our code. I would rather work with a full blown language (like the full Ruby in Chef) and ignore the instructions I don't need. To do DevOps work I am coming from the software development side, so I miss the power of a real programming language.

dhait commented

+1

I'm running into this as well.

This is not an enhancement - it should be standard functionality and is a bug.

We should be able to interpolate all variable types and over load them this is much more important than a graph feature which is a nice to have but probably much less used or needed.

@pixie79 I'm not a Terraform developer, but to the best of my knowledge resource graphs are a core aspect of terraforms dependency modeling, and a key reason why terraform is declarative and parse order independent.

@jonapich suggestion of supporting lifecycle attributes at the module level would satisfy my use cases as well.

+1

+1

+1

+1

Hi, I'm not sure if this is the right place to raise this, but it seems lifecycle doesn't respect overrides either. I thought, if I can't set a variable, maybe I could at least override it with a web_override.tf file where I set the prevent_destroy value to false. Is this part of the same thing?

Instead, I get this error:

the plan would destroy this resource, but it currently has lifecycle.prevent_destroy set to true. To avoid this error and continue with the plan, either disable lifecycle.prevent_destroy or adjust the scope of the plan using the -target flag.

Same-same as @laggyluke - staging/prod separation. No ability to 'promote' prevent_destroy flag across reusable modules. Did something wrong in dev/staging - no way you gonna change that flag across dozens modules and hundreds resources. Like, guys, seriously?

There was a comment by @phinze early on "how hard to keep up with the graph" and so on. Cool with that but probably that's a wrong focus. One way to see - yep, how hard to change architecture and internal stuff, the other way to look on that - maybe the current architecture does not address real-world scenarios anymore. So don't get stuck on something what does not work anymore. Time to re-engineer and make some dramatic changes. What's the focus is not up to me to decide yet the real world need may have a higher priority than the beautify of some legacy architecture.

Next, @pinterl mentioned some people come from DevOps area with complex need. Covering tons of scenarios, Terrafrom is still extremely 'static / declarative' comparing to a fully formed language such as Python, Ruby or PowerShell. That's an idea that constantly in my mind. Simple saying, why the heck one would re-invent a language? Really. Evidently, terrafrom introduced its own language to describe infrastructure. The language started as a static one. Step by step issues arose - the idea of having 'static' language met a reality, it seems to be. For-each, if-else, parametrization and all things that you would have in a normal language. Ended up with a something in between - not static anymore not fuly function language. No intellisense. Like, seriously? You what, remember all attributes and values or what? - always wanted to ask this one about intellisense. Whatever, thank you, still great tool :)

Outcome? Reinventing the XML-like mess with 'hard-to-debug, trace, maintain and author' basic constructions. Parametrization is a token based copy-paste of strings. Anyone happy with that? No one has typos and no one wastes time on trying to figure out that little typo in modules? Just checking, maybe @pinterl and me aren't following something great :)

Alright, some ideas to solve that. As a temporary workaround, terrafrom just pushes us to come up with our own DSL. Does not really matter which language it be, what matters is to have compile-time checks, no typos, control over the flow and parametrization.

For some people DSL may work better. With complex configs, say, Terrafrom + PowerShell DSC + PowerShell custom scripts it's better to have one DSL to rule them all.

Nevertheless, probably it would be really great to start thinking of some kind of DSL for terrafrom on top of Python, Ruby or PowerShell. Probably that would be an interesting move, and probably it would be just about right to get started. At least it would give us back the intellisense, avoid typos, give the control over the flow. Actually, reminds a bit how TypeScript made JS looks better. TypeTerrafrom language would be just about right :)

@avishnyakov

Declarative languages and graph dependency ordering are not new concepts. Puppet has used the same approach very successfully. Yes, there are some disadvantages with purely declarative resource management, but those limitations are outweighed by the huge upsides of idempotency, automatic ordering, and ease of development the code.

A declarative language is not simply a 'partially formed' language; it is a thing in and of itself. I would agree that Terraform has some limitations at this time; complex data structures are difficult to manage, the language doesn't always handle conditional logic in an elegant way, there are some teething problems with the ordering model, and the obtuseness of error handling. But these problems are being resolve, and by no means justify throwing out the declarative model.

If you want to use Python or Ruby to build templates, it's trivial to do so thanks to the JSON syntax support. This is the approach we've taken on our projects, and it has been extremely successful.

@cbarbour sure, agree on that, makes sense.
Would you share your experience on the approach you guys took with Python or Ruby? Do you generate terraform templates from your own DSL? What are the reasons for that?

Sure.

In our case, we were using Terraform to create self-service AWS clusters for research computing projects.

Our primary set of templates focused on Compute resources. The compute module included several different sets of templates allowing users to create a cluster containing a master node, hourly nodes, spot nodes, etc. Variables are used to pass instance counts, instance type, AMI etc. If a user didn't want spot instances, they could simply set the instance count to 0.

Storage was separated into another module.

We wrote a front end Python script for configuring a cluster. That script would write out a root module declaring instances of the compute and storage modules based on the user's input. This would be written out as a JSON template.

Using modules allowed us to define a clean interface between Python and Terraform. Points of interface were all input variables and output data from terraform. This allowed us to develop our Python code independently from the Terraform templates; major changes could be made to either, with minimal changes to that interface.

Separating storage into it's own module made it relatively safe to remove a compute cluster without deleting the data; simply remove the compute module declaration, but leave storage in place. That storage could later be attached to another compute module.

Basically:

Python > JSON Root Terraform Module > compute/storage implementation modules.

Great stuff. Doing same-same with Azure/SharePoint. Similar approach with module separation - some for Azure bootstrapping, some for servers bootstrapping in different configs, some for domain controllers, some for SQL and SharePoint and so on. Exactly that's why we stumbled upon various limitations while combining modules together - lacking parameters and cleaner control over parametrization and module combinations. Heading same direction with PowerShell:

PowerShell DSL > Terraform Module > bunch of the modules

+1

@avishnyakov , @cbarbour don't we lose here the main idea of declarative infrastructure description? If we need to write scripts to generate terraform templates, why we need it at all? We can just create classes using cloud provider SDK and reuse them in scripts.

Maybe we should take a look on Chef/Puppet approach: DSL + libraries for non-common logic?

Any progress on the issue?

+1

+1

oli-g commented

Hi @mitchellh, can you say if this enhancement has some possibility to be added to a 0.9 future release? Thank you and keep the good work!

+1 I have an RDS module and I want the default marked as prevent_destroy so we don't lose data. Unfortunately this means that if I ever do want to destroy that DB I have to either modify that module (not ideal) or I need to go into the AWS console and do it. Please add this as a feature!

+1

+1

zzzap commented

+1

As a dirty hack workaround for this, I've made this to make lifecycle work as a variable:

variable "cluster" {
  type = "map"
  default = {
    protect_asg = false
  }
}

resource "aws_autoscaling_group" "asg" {
  ...
}
resource "random_id" "protector" {
  count = "${ lookup(var.cluster, "protect_asg", false) ? 1 : 0 }"
  byte_length = 8
  keepers = {
    asg_id = "${ aws_autoscaling_group.asg.id }"
  }
  lifecycle {
    prevent_destroy = true
  }
}

So, if I want to protect an ASG, then I would set that variable, and random would generate itself with the data from that ASG's ID. Random can not change itself without a delete/recreate, and would trigger the prevent_destroy.

If I don't want to protect the resource, then the count = 0 for the random thing, and it doesn't get created.

The downside to this method is that it's pretty tough to un-protect a resource :(.

To provide a bit more granularity, the following might help:

Inside the module:

resource "aws_autoscaling_group" "asg" {
  ...
}

output "protector" {
  value = "${aws_autoscaling_group.asg.id}"
}

And in the module caller:

module "my_module" {
  ...
}

resource "random_id" "protector" {
  count = "${ var.needs_protection ? 1 : 0 }"
  byte_length = 8
  keepers = {
    protector = "${ module.my_module.protector }"
  }
  lifecycle {
    prevent_destroy = true
  }
}

This second way lets you turn off the prevent_destroy for the single resource that calls the module, rather than for all modules. To override and make a change, change the prevent_destroy to false first.

2nd anniversary of that "enhancement" comes in ~4 months. Could maintainers please shed some light if this issue going to be fixed or not. It's hard to circumvent these limitations in module development thus decision should be made: should we do enormous code duplications going forward or not.

Thank you!

This is another issue that rolls up into the broader proposal #4149, which gives us a building-block we need to allow the use of possibly-dynamic data in attributes that influence graph construction. It is a goal to get that (or something like it) done eventually, but it's a pretty big change and so it's had trouble rising to the top of the list to work on. Some internal refactoring done in Terraform 0.8 and 0.9 has put us in a much better spot to get that done, but I'm afraid it's not a short-term goal since the whole team is allocated to other work at this time.

I understand the frustration caused by this limitation, and I would very much like to address it eventually!

@apparentlymart

Thanks for the update.

Hope this issue can be fixed before release v1.0

+1 to this feature. Being able to create modules with with the flexibility of flags to conditionally control create_before_destroy or similar attributes would go a long way to keep modules as simple as possible. In my current use case, I'd love to pass a flag to this community module which would conditionally enable rolling releases.

Hi everyone!

Thanks for the great discussion here. Since it seems to have now reached the point of just attracting "+1" comments I'm going to lock it to reduce the noise for the many people who are watching this issue. We'll come back and unlock it when we get the point of being ready to start work here (once the prerequisite work has been done.)

crw commented

Per #33031 (comment), this issue as written no longer describes a problem that is fully relevant in modern Terraform (circa 2023). I will quote the full comment here for posterity:

Thanks for compiling these issues. The original issue #3116 is probably no longer relevant as written -- interpolation is possible and used within lifecycle, but how it's used depends on each individual feature of a resource's lifecycle. Asking for interpolation for lifecycle in general is not going to effectively change anything.

It seems you are mostly concerned with prevent_destroy specifically, and one reason that has not changed is because it cannot do what the majority of users want it to do regardless of the ability to interpolate the value. Controlling how something is destroyed from the config doesn't work when that configuration can be removed entirely. If better control over the ability to destroy a resource is what you're after, #22544 covers this (and already shows a lot of support), but I think #24658 is probably where users should indicate their support for a more useful feature in general (or with the individual provider issues where that is applicable, since absolute control can only ever be applied outside of Terraform).

Since there is no specific bug to address here, I'm going to close the issue. If you have other concerns about enhancements which are not addressed by existing issues already, we can handle those individually in separate issues.

To follow up with the modern ancestors of this issue, please see the following:

References