Error: cannot list resource "pods" in API group
vburckhardt opened this issue · 3 comments
Affected modules
Consumer reporting this:
```
2024/02/23 16:05:59 Terraform apply | module.roks-cluster.null_resource.confirm_network_healthy[0]: Creating...
2024/02/23 16:05:59 Terraform apply | 2024-02-23T16:05:59.963Z [INFO] Starting apply for module.roks-cluster.null_resource.confirm_network_healthy[0]
2024/02/23 16:05:59 Terraform apply | 2024-02-23T16:05:59.963Z [DEBUG] module.roks-cluster.null_resource.confirm_network_healthy[0]: applying the planned Create change
2024/02/23 16:05:59 Terraform apply | module.roks-cluster.null_resource.confirm_network_healthy[0]: Provisioning with 'local-exec'...
2024/02/23 16:05:59 Terraform apply | module.roks-cluster.null_resource.confirm_network_healthy[0] (local-exec): Executing: ["/bin/bash" "-c" ".terraform/modules/roks-cluster/scripts/confirm_network_healthy.sh"]
2024/02/23 16:05:59 Terraform apply | module.roks-cluster.null_resource.confirm_network_healthy[0] (local-exec): Running script to ensure kube master can communicate with all worker nodes..
2024/02/23 16:06:00 Terraform apply | module.roks-cluster.null_resource.confirm_network_healthy[0] (local-exec): Error from server (Forbidden): pods is forbidden: User "IAM#serviceid-XYZ" cannot list resource "pods" in API group "" in the namespace "calico-system"
2024/02/23 16:06:00 Terraform apply | module.roks-cluster.null_resource.confirm_network_healthy[0] (local-exec): Success! Master can communicate with all worker nodes.
2024/02/23 16:06:00 Terraform apply | module.roks-cluster.null_resource.confirm_network_healthy[0]: Creation complete after 0s [id=]
```
I suspect this is happening at the kubectl get pods call in scripts/confirm_network_healthy.sh.
The consumer states that the serviceId has the necessary permissions. Assuming this is correct, the issue could be caused by delays in RBAC sync; it would be good to double-check whether this line appears in the CI logs.
The other aspect is that the check does not fail when kubectl get pods returns an error: the Forbidden error above is ignored and the script still reports success.
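Purely to illustrate the error-swallowing pattern (this is a hypothetical sketch, not the actual confirm_network_healthy.sh):

```bash
# Hypothetical sketch, not the actual script: kubectl's exit status is
# never checked, so the Forbidden error is printed but ignored and the
# script falls through to the success message.
kubectl get pods -n calico-system -o wide
echo "Success! Master can communicate with all worker nodes."
```

Adding `set -euo pipefail` at the top of the script, or an explicit exit-status check after the kubectl call, would surface the error instead.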
Terraform CLI and Terraform provider versions
- Terraform version:
- Provider version:
Terraform output
Debug output
Expected behavior
Actual behavior
Steps to reproduce (including links and screen captures)
- Run `terraform apply`
Anything else
Setting admin = true on the data source in terraform-ibm-base-ocp-vpc/main.tf (line 255 in a01bdec).
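For reference, a minimal sketch of what that would look like (illustrative only; the data source arguments and resource reference are assumptions based on the IBM provider, not a quote from the module):

```hcl
# Sketch: pull the cluster-admin kubeconfig instead of the IAM identity's
# config, so the local-exec kubectl calls are not subject to RBAC sync delay.
data "ibm_container_cluster_config" "cluster_config" {
  cluster_name_id = ibm_container_vpc_cluster.cluster.id # assumed resource reference
  admin           = true                                  # fetch admin certificates
}
```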
I don't think we should use the admin flag; there may be audit concerns with that. Instead, I think the fix is to add a retry with a sleep to the first kubectl command, as RBAC sync is usually ready in a matter of seconds.
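A minimal sketch of that retry approach (the namespace comes from the log above; the attempt count and sleep interval are arbitrary assumptions):

```bash
#!/bin/bash
set -euo pipefail

# Retry the first kubectl call to ride out RBAC sync delays, which are
# usually resolved within seconds. Fail hard if it never succeeds rather
# than falling through to the success message.
MAX_ATTEMPTS=10
for attempt in $(seq 1 "$MAX_ATTEMPTS"); do
  if kubectl get pods -n calico-system >/dev/null 2>&1; then
    break
  fi
  if [ "$attempt" -eq "$MAX_ATTEMPTS" ]; then
    echo "Error: unable to list pods in calico-system after $MAX_ATTEMPTS attempts" >&2
    exit 1
  fi
  sleep 5
done
```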
The consumer is using the admin flag, so I think we should do the same. The admin config is pulled after authenticating with the ibmcloud CLI under an identity, so it should be possible to correlate usage if needed. If this becomes an issue from an audit perspective, some coordination would be needed around having that admin identity disabled through the ROKS stack, an SCC scan tracking this, etc.