Changing root_parent_id results in Management Groups not being deployed
lpmulligan opened this issue ยท 10 comments
Community Note
- Please vote on this issue by adding a ๐ reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Versions
terraform: 1.0.8
azure provider: 2.80.0
module: 1.0.0
Description
I'm trying to deploy an Enterprise Scale LZ deployment within an existing EA structure. I've been assigned owner to a Management Group 3 levels down from the Tenant Root Group. I would like to deploy a stock ES LZ within that management group. Changing root_parent_id from:
root_parent_id = data.azurerm_client_config.core.tenant_id
root_id = "myorg"
root_name = "My Organization"
deploy_demo_landing_zones = true
to
root_parent_id = "ELZDemo"
root_id = "myorg"
root_name = "My Organization"
deploy_demo_landing_zones = true
Results in all the
resource "azurerm_management_group" "level_1", resource "azurerm_management_group" "level_2", etc resources being skipped.
Manually creating the Management Groups under "ELZDemo" by hand allows the core management terraform apply
to run successfully.
Steps to Reproduce
- Set
root_parent_id = data.azurerm_client_config.core.tenant_id
in main.tf - Run
terraform plan
- You get MG groups created with 191 resources
- Set
root_parent_id = "ELZDemo"
- You get no MG groups with 180 resources.
Screenshots
N/A
Additional context
N/A
Hey @lpmulligan,
Thanks for raising an issue ๐
I have just tested this myself and it worked okay for me:
As you can see highlighted in orange.
My TF file looked like this:
# We strongly recommend using the required_providers block to set the
# Azure Provider source and version being used.
terraform {
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = ">= 2.77.0"
}
}
}
provider "azurerm" {
features {}
}
# You can use the azurerm_client_config data resource to dynamically
# extract connection settings from the provider configuration.
data "azurerm_client_config" "core" {}
# Call the caf-enterprise-scale module directly from the Terraform Registry
# pinning to the latest version
module "enterprise_scale" {
source = "Azure/caf-enterprise-scale/azurerm"
version = "1.0.0"
providers = {
azurerm = azurerm
azurerm.connectivity = azurerm
azurerm.management = azurerm
}
root_parent_id = "jt-102021-landingzones"
root_id = "190"
root_name = "GH Issue 190 Test"
deploy_demo_landing_zones = true
}
Are you able to share your TF file/s and a simple diagram of your Management Group hierarchy? (please redact any sensitive information)
Look forward to hearing from you
Thanks
Jack
@jtracey93. Thanks for double-checking this for me.
I did try this deployment in two separate Azure tenants to make sure I didn't miss something. The second tenant being my MSDN one. I got the same results; hence, I opened up this issue.
A few things:
-
I reran things again in my MSDN subscription and things did work as expected. I think I may have forgotten
to adddeploy_demo_landing_zones = true
during my testing. Sorry about. Addingdeploy_demo_landing_zones = true
may not be the reason it worked. See #4 below. -
In my EA subscription, things continue to not work.
You'll notice that I don't have access to all the MGs in the tree like you do...nor like I do in my MSDN sub!
- My code is similar to yours. Here's the module block
module "enterprise_scale" {
source = "Azure/caf-enterprise-scale/azurerm"
version = "1.0.0"
providers = {
azurerm = azurerm
azurerm.connectivity = azurerm
azurerm.management = azurerm
}
root_parent_id = "ELZDemo"
root_id = "eslz"
root_name = "Enterprise-Scale LZ"
deploy_demo_landing_zones = true
}
- Some interesting developments
- Setting my module to this...exactly like yours
module "enterprise_scale" {
source = "Azure/caf-enterprise-scale/azurerm"
version = "1.0.0"
providers = {
azurerm = azurerm
azurerm.connectivity = azurerm
azurerm.management = azurerm
}
root_parent_id = "jt-102021-landingzones"
root_id = "190"
root_name = "GH Issue 190 Test"
deploy_demo_landing_zones = true
}
Results in the TF plan saying it's going to create the MG groups as expected even though the parent MG "jt-102021-landingzones" doesn't exist.
However, if I change root_parent_id = "jt-102021-landingzones"
to root_parent_id = "ELZDemo"
the TF plan skips the creation of the MGs.
- Sorry for waiting until the end, but I think I've narrowed down the problem.
- If I set
root_parent_id = "ELZDemo", it doesn't work. However, if I set
root_parent_id = "eslz-demo" it does. So I'm thinking there's something up with how the module identifies the root_parent_id MG and there's no hyphen in there.
Sincerely,
Larry
Hey @lpmulligan,
Thanks for the excellent write up, really useful. Let me review and investigate further and we'll come back to you ๐
Cheers
Jack
Quick update have tested the hyphen theory out further:
Code:
terraform {
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = ">= 2.77.0"
}
}
}
provider "azurerm" {
features {}
}
module "enterprise_scale" {
source = "Azure/caf-enterprise-scale/azurerm"
version = "1.0.0"
providers = {
azurerm = azurerm
azurerm.connectivity = azurerm
azurerm.management = azurerm
}
root_parent_id = "tf190level3"
root_id = "eslz"
root_name = "Enterprise-Scale LZ"
deploy_demo_landing_zones = true
}
Also just tried the same hierarchy above with an SPN that only has RBAC as below:
- Reader on a subscription (used in provider block) outside of "tf190" hierarchy
- Owner on "tf190level3"
Still got a correct plan of 191 resources
Hey @lpmulligan,
So from digging through the module a little deeper tonight we think we may of found the reason why this is happening.
These 2 lines in locals.management_group.tf
seem to be of importance for this scenario and its actually looking like its due to the casing used for the input of root_parent_id
:
On the line, from the first link, we are building all of the maps for the Management Groups to be created by the module, but we are evaluating if the parent_management_group_id
(that we create above in the same file, ready for deployment later) is the same (case sensitive as its Terraform) as root_parent_id
.
But in the second link we have some logic that says if the parent_management_group_id
is longer than 0 characters in length, which it is, then lowercase the string and replace any non-alphanumeric characters with -
's.
So when combined your root_parent_id
of ELZDemo
turns into elzdemo
and when they are compared, remembering they are case sensitive, they don't match. Which then causes the if
statement (first link) to kick in and make the map empty for level_1, which then has a cascading effect down through the other levels.
So what are we going to do...?
Good question, we need to do some further testing of some potential fixes and we also need to just check "why" we implemented the logic to start with an we will then make a PR to fix.
Give us a few days and we will come back to you ๐
Hope that all makes sense, and feel free to reach out with any further questions or queries
Thanks
Jack
(also thanks to @krowlandson for the late night discussion to review my thinking ๐)
Well shoot. I thought I had it. I think it's something in my environment not related to the code. I just had a co-worker change the MG ID from ELZDemo to eslz-demo and it works. However, your test shows that my hypothesis was wrong which is fine. Something's up with my setup...I don't know where but feel free to close this issue. Thanks for your assistance.
You certainly led us down the right path, so thank you ๐
We will leave this open so we can track the bug fix ๐
If you have another issue, feel free to open a another issue on the repo. Happy to help ๐
Well shoot. I thought I had it. I think it's something in my environment not related to the code. I just had a co-worker change the MG ID from ELZDemo to eslz-demo and it works. However, your test shows that my hypothesis was wrong which is fine. Something's up with my setup...I don't know where but feel free to close this issue. Thanks for your assistance.
@lpmulligan - just to add that it may have worked if you just set root_parent_id = "elzdemo"
(i.e. all lowercase).
As per @jtracey93's follow up, we have found the point where this is going wrong for you in the code and will work on getting this resolved in a future release. We first need to work out the impact of the fix though, as we may cause unexpected changes if someone has inadvertently used the incorrect casing somewhere within their config and the following is automatically "correcting" this:
We need to be careful to understand whether changing this will cause deployments with incorrect config to fail, or to redeploy Management Groups.