One or more users named in the policy do not belong to a permitted customer (step 0-bootstrap)
lpezet opened this issue · 9 comments
TL;DR
I'm going through https://github.com/terraform-google-modules/terraform-example-foundation/blob/master/0-bootstrap/README-GitHub.md.
When running either step 21 or 31 (if letting the pipeline create the groups), the following error can happen, and did for me (I obfuscated the values, using example.com and a fake org ID):
Error: Error applying IAM policy for organization "1234567890": Error setting IAM policy for organization "1234567890": googleapi: Error 400: One or more users named in the policy do not belong to a permitted customer.
Details:
[
{
"@type": "type.googleapis.com/google.rpc.PreconditionFailure",
"violations": [
{
"description": "User gcp-organization-admins@example.com is not in permitted organization.",
"subject": "orgpolicy:organizations/1234567890?configvalue=gcp-organization-admins%example.com",
"type": "constraints/iam.allowedPolicyMemberDomains"
}
]
}
]
, failedPrecondition
with module.seed_bootstrap.google_organization_iam_member.org_admin_serviceusage_consumer[0],
on .terraform/modules/seed_bootstrap/main.tf line 252, in resource "google_organization_iam_member" "org_admin_serviceusage_consumer":
252: resource "google_organization_iam_member" "org_admin_serviceusage_consumer" {
Error: Error applying IAM policy for storage bucket "b/***": Error setting IAM policy for storage bucket "b/***": googleapi: Error 400: Group gcp-organization-admins@example.com does not exist., invalid
with module.seed_bootstrap.google_storage_bucket_iam_member.orgadmins_state_iam[0],
on .terraform/modules/seed_bootstrap/main.tf line 276, in resource "google_storage_bucket_iam_member" "orgadmins_state_iam":
276: resource "google_storage_bucket_iam_member" "orgadmins_state_iam" {
Expected behavior
Running terraform apply only once.
Observed behavior
Going through https://github.com/terraform-google-modules/terraform-example-foundation/blob/master/0-bootstrap/README-GitHub.md, I had this issue at step 23. Run terraform apply.
I re-ran it and it went fine.
I encountered issue #1206 and, after running through fix #1206 (comment), step 31. The Pull request will trigger... gave the same error.
Terraform Configuration
# terraform.tfvars
org_id = "1234567890"
billing_account = "000000-000000-000000"
groups = {
  create_required_groups = true           # Change to true to create the required_groups
  create_optional_groups = false          # Change to true to create the optional_groups
  billing_project        = "project-1234" # Fill to create required or optional groups
  required_groups = {
    group_org_admins     = "gcp-organization-admins@example.com"
    group_billing_admins = "gcp-billing-admins@example.com"
    billing_data_users   = "gcp-billing-data@example.com"
    audit_data_users     = "gcp-audit-data@example.com"
  }
}
default_region = "us-central1"
default_region_2 = "us-west1"
default_region_gcs = "US"
default_region_kms = "us"
/* ----------------------------------------
    Specific to github_bootstrap
   ---------------------------------------- */
gh_repos = {
  owner        = "someone",
  bootstrap    = "example-bootstrap",
  organization = "example-org",
  environments = "example-envs",
  networks     = "example-nets",
  projects     = "example-projs",
}
Terraform Version
Terraform v1.8.3
on linux_amd64
+ provider registry.terraform.io/hashicorp/google v5.34.0
+ provider registry.terraform.io/hashicorp/google-beta v5.34.0
+ provider registry.terraform.io/hashicorp/null v3.2.2
+ provider registry.terraform.io/hashicorp/random v3.6.2
+ provider registry.terraform.io/hashicorp/time v0.11.2
+ provider registry.terraform.io/integrations/github v5.34.0
Additional information
I believe the fix (I'll propose one) is for module.seed_bootstrap to depend on module.required_group, so that the groups are created before terraform-google-modules/bootstrap/google (module.seed_bootstrap) executes the google_organization_iam_member resources.
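A minimal sketch of that dependency, assuming the module blocks in 0-bootstrap are named seed_bootstrap and required_group as they appear in the error output above. The other module arguments are omitted, so this is a fragment rather than a complete configuration, and the change that was eventually merged may look different:

```hcl
module "seed_bootstrap" {
  source = "terraform-google-modules/bootstrap/google"

  # Existing arguments (org_id, billing_account, group names, and so on)
  # stay as they are; they are left out of this sketch.

  # Assumption: the groups are created by a module named "required_group".
  # A module-level depends_on makes Terraform finish that module before it
  # starts any resource inside seed_bootstrap, including the
  # google_organization_iam_member bindings that failed above.
  depends_on = [module.required_group]
}
```

The trade-off is that the whole seed module waits for group creation, even the resources that never reference a group.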
#1273 has been merged, but from reading the details of #1206 I'm not certain whether this solves the issue. (From 1206, it looks like there are inconsistent permissions based on whether groups are created manually on the admin console or as part of the automation using service accounts).
@lpezet can you please confirm whether you're still seeing the issue after this change?
@eeaton The behavior I mentioned in #1273 was happening before and after implementing the fix from #1206.
I'll re-run it as soon as I get the chance (been busy) but if I can confirm my fix does address the issue, I'd love to find a way to add that in the tests (is it possible to "delay"/slow down group creation before the seed project configuration?).
@eeaton It's proving difficult to destroy everything 0-bootstrap created. I only provided the minimum (org_id, billing_account, groups object, default_region* and gh_repos information in terraform.tfvars) and I now realize I should have looked at the bucket_tfstate_kms_force_destroy and bucket_force_destroy variables as well, to make it possible to redo this whole process again and again (something I wanted to do from the beginning).
Now running into issues like:
│ Error: error loading state: Failed to open state file at gs://bkt-prj-b-seed-tfstate-XXXX/terraform/bootstrap/state/default.tfstate: googleapi: got HTTP response code 403 with body: <?xml version='1.0' encoding='UTF-8'?><Error><Code>AccessDenied</Code><Message>Permission denied on Cloud KMS key. Please ensure that your Cloud Storage service account has been authorized to use this key.</Message></Error>
If you have any tips on what to specify/do at the beginning to be able to go through 0-bootstrap and then destroy everything cleanly to repeat, please let me know so I can use that next time.
I'd love to find a way to add that in the tests (is it possible to "delay"/slow down group creation before the seed project configuration?).
In general yes, and we do have a number of sleep timers and retry logic where resources aren't available to reference on GCP immediately after terraform apply commands. However, I subsequently added some details to #1206 that identifies the root cause as a permissions issue, so I don't think adding more sleep timers would make a difference here.
│ Error: error loading state: Failed to open state file at gs://bkt-prj-b-seed-tfstate-XXXX/terraform/bootstrap/state/default.tfstate: googleapi: got HTTP response code 403 with body: <?xml version='1.0' encoding='UTF-8'?><Error><Code>AccessDenied</Code><Message>Permission denied on Cloud KMS key. Please ensure that your Cloud Storage service account has been authorized to use this key.</Message></Error>
From the error, you might have cryptoshredded yourself (deleting the encryption key makes resources completely inaccessible).
A few things to try:
- With default settings, you can probably restore the key manually within the soft delete period.
- For IAM issues in general, you might try tweaking the permissions manually on the console and then reapplying.
- Some state errors can occur when a terraform apply or destroy command fails midway through: the Terraform state thinks an object exists, but GCP never created it. See the example and workaround at #1187.
bucket_tfstate_kms_force_destroy and bucket_force_destroy ... any tips
- Set parent_folder to a unique folder each time, so that each instance of the foundation is deployed in isolation to that folder instead of at the org node.
- Also set create_unique_tag_key to true to avoid a global clash at the org level.
- Set all of the force destroy vars to true.
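As a rough illustration of those tips, a terraform.tfvars for a throwaway deployment might add something like the following. The folder ID is a made-up placeholder, and only the two force destroy variables named in this thread are shown; other steps may define more of them.

```hcl
# terraform.tfvars additions for a disposable test deployment (sketch)

# Hypothetical folder ID; use a fresh folder for each test run.
parent_folder = "000000000000"

# Allow terraform destroy to remove the state buckets and their KMS keys.
bucket_force_destroy             = true
bucket_tfstate_kms_force_destroy = true

# create_unique_tag_key = true
# Left commented out: set it in whichever step's tfvars defines this
# variable, which may not be 0-bootstrap.
```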
In general yes, and we do have a number of sleep timers and retry logic where resources aren't available to reference on GCP immediately after terraform apply commands. However, I subsequently added some details to #1206 that identifies the root cause as a permissions issue, so I don't think adding more sleep timers would make a difference here.
I meant it as a way to confirm this is an issue by adding sleep timer(s) in the test (when creating required groups) to see if the seed_bootstrap module breaks with the error I experienced (thereby replicating my situation). This is a race condition in the end, isn't it?
Then test with my fix to see whether it addresses the issue or not. That's what I meant. Sorry for the confusion.
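One way such a delay could be sketched is with the time_sleep resource from the hashicorp/time provider, which already appears in the provider list above. The resource name and the 120s duration are arbitrary choices for illustration, and nothing like this exists in the repo's tests as far as this thread shows:

```hcl
# Hypothetical reproduction aid: hold back group creation for two minutes.
resource "time_sleep" "delay_group_creation" {
  create_duration = "120s"
}

# Then, inside the existing module "required_group" block, add:
#
#   depends_on = [time_sleep.delay_group_creation]
#
# If seed_bootstrap has no dependency on the groups, it is free to start
# while they are still pending, which is the ordering the reproduction
# needs. With the depends_on fix in place, seed_bootstrap would wait.
```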
I did cryptoshred myself, didn't I? lol
Thanks for the tips.
From the discussion in #1206, I don't think this is a race condition. It looks like different permissions are applied depending on how the groups are created. When the service account creates the groups, it automatically gets the OWNER permission on the Cloud Identity resources. When the user creates the groups manually on the Cloud Identity admin console, the service account doesn't have any permissions for Cloud Identity, which manages permissions outside of Google Cloud IAM policies. I've made it a backlog item to improve the overall guidance to steer people away from this edge case in a future release.
I'll close this issue for now, but feel free to re-open if you disagree.
@eeaton My bad. I was referring to this issue, #1272, and NOT #1206 all this time. When you said:
#1273 has been merged, but from reading the details of #1206 I'm not certain whether this solves the issue. (...)
@lpezet can you please confirm whether you're still seeing the issue after this change?
I thought you meant whether fix #1273 addressed this issue #1272, based on what was said in #1206.
I can confirm fix #1273 worked for me. I would have liked to contribute a way to effectively test fix #1273, but I don't fully understand how the tests work and couldn't find anything relevant at first sight in test/integration/bootstrap/bootstrap_test.go.
Got it, thanks for clarifying.
If you're interested, here's a codelab introducing the test framework used by this repo and others based off of CFT:
https://codelabs.developers.google.com/cft-onboarding
For this particular repo, though, I think running all the tests locally is an unreasonable burden for contributors trying to make a small fix. (Even assuming everything goes smoothly, it takes multiple hours to deploy all the infra and run tests and tear down again)
When you raise a PR, all the tests run on the backend before it can be approved to merge. My practical rule of thumb for this enormous repo is to run the minimum locally, make docker_test_lint and make docker_generate_docs, to catch obvious issues, then leave the detailed tests to the CI workflow triggered on a PR.