IndexError: list index out of range in CustomControlTowerStateMachineLambda lambda function
pimpuks opened this issue · 2 comments
Describe the bug
Encountered "IndexError: list index out of range" when accounts are provisioned to a previously empty OU and there are stack_set resources defined in manifest.yaml that target that OU.
To Reproduce
- Set up Control Tower
- Create a new OU (for example "TEST")
- Deploy CfCT
- Update the manifest to deploy a valid stack set to the "TEST" OU and upload the manifest file to the S3 bucket (the CfCT pipeline and Step Functions work fine while the OU is still empty)
- Provision a new account to the "TEST" OU. The Step Functions execution for the stack set instance starts but fails at "CustomControlTowerStateMachineLambda"
Expected behavior
CfCT creates stackset instances successfully
Please complete the following information about the solution:
- 2.70 version
- [ ap-northeast-1 ] Region
- [ no ] Was the solution modified from the version published on this repository?
- If the answer to the previous question was yes, are the changes available on GitHub?
- [ yes ] Have you checked your service quotas for the services this solution uses?
- [ yes ] Were there any errors in the CloudWatch Logs?
Additional context
Log messages from CloudWatch Logs of the Lambda function
{
"time_stamp": "2024-03-08 11:29:29,334",
"log_level": "INFO",
"log_message": {
"RequestType": "Create",
"ResourceProperties": {
"StackSetName": "CustomControlTower-xxxxxxx-stackset",
"TemplateURL": "https://customizedforcontrotowers-customcontroltowerpipeli-xxxxxxxx.s3.ap-northeast-1.amazonaws.com/_custom_ct_templates_staging/templates/backstage_cross_account_roles.yaml",
"Capabilities": "[\"CAPABILITY_NAMED_IAM\",\"CAPABILITY_AUTO_EXPAND\"]",
"Parameters": {
"BackstageAccountId": "xxxxxxx",
"BackstageStepFunctionsServiceProvisionRoleName": "xxxxxxx",
"BackstageStepFunctionsAccountProvisionRoleName": "xxxxxxx",
"SharedBackstagePortfolioId": "xxxxxxx"
},
"AccountList": [
"<AccountId1>",
"<AccountId2>",
"<AccountId3>",
"<AccountId4>",
"<AccountId5>"
],
"RegionList": [
"ap-northeast-1"
],
"SSMParameters": {}
},
"SkipUpdateStackSet": "yes",
"params": {
"ClassName": "CloudFormation",
"FunctionName": "list_stack_instances"
},
"LoopFlag": "not-applicable",
"StackSetExist": "yes",
"StackInstanceAccountList": []
}
}
The input StackInstanceAccountList is an empty list. However, the code only checks that StackInstanceAccountList is not None and then tries to access the first account ID in the empty list, which causes the error.
After I manually added a stack instance (via the console) to one of the target accounts and triggered the pipeline again, the Step Functions execution worked fine.
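For anyone who wants to see the failure mode in isolation, here is a minimal sketch. It is not the CfCT code: `first_account_unsafe` and `first_account_safe` are hypothetical helpers, and only the `StackInstanceAccountList` key comes from the log above. It shows why an `is not None` check lets an empty list through.

```python
def first_account_unsafe(event):
    """Mirrors the failing pattern: only None is ruled out."""
    accounts = event.get("StackInstanceAccountList")
    if accounts is not None:
        # An empty list is not None, so this line is still reached.
        return accounts[0]
    return None


def first_account_safe(event):
    """Treats a missing key, None, and [] the same way."""
    accounts = event.get("StackInstanceAccountList")
    if accounts:  # False for None and for an empty list
        return accounts[0]
    return None


# Same shape as the event in the log: the key is present but the list is empty.
event = {"StackInstanceAccountList": []}

try:
    first_account_unsafe(event)
    outcome = "no error"
except IndexError as exc:
    outcome = f"IndexError: {exc}"

print(outcome)
print(first_account_safe(event))
```

The truthiness check is the usual Python idiom when "no value" and "empty value" should take the same branch.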
@pimpuks thank you for reporting the bug and providing the context.
I have created an internal backlog item to address this edge case.
We hit the same bug. As no contributions are accepted, if anyone is reading this, we fixed it by changing the following code:
diff --git a/source/src/cfct/state_machine_handler.py b/source/src/cfct/state_machine_handler.py
index 961a747..30e65c2 100644
--- a/source/src/cfct/state_machine_handler.py
+++ b/source/src/cfct/state_machine_handler.py
@@ -223,7 +223,7 @@ class CloudFormation(object):
         # to trigger create operation accordingly.
         if (
             not response.get("Summaries")
-            and self.event.get("StackInstanceAccountList") is None
+            and (self.event.get("StackInstanceAccountList") is None or not self.event.get("StackInstanceAccountList"))
         ):
             self._set_only_create_stack_instance_operation()
             return self.event
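As a side note, the patched condition can be checked outside the Lambda. The sketch below is an illustration only: `patched_condition` and `simplified_condition` are hypothetical standalone functions, and plain dicts stand in for `response` and `self.event`. It suggests that `x is None or not x` behaves the same as `not x` for the inputs that matter here, so a shorter form of the guard would probably work too.

```python
def patched_condition(response, event):
    """The guard from the diff above, rewritten as a standalone predicate."""
    accounts = event.get("StackInstanceAccountList")
    return not response.get("Summaries") and (accounts is None or not accounts)


def simplified_condition(response, event):
    """Shorter form: `not accounts` already covers None and []."""
    return not response.get("Summaries") and not event.get("StackInstanceAccountList")


# Events to compare: key missing, explicit None, empty list, one account.
cases = [
    {},
    {"StackInstanceAccountList": None},
    {"StackInstanceAccountList": []},
    {"StackInstanceAccountList": ["123456789012"]},  # placeholder account ID
]

empty_response = {}  # no "Summaries", i.e. no existing stack instances
results = [
    (patched_condition(empty_response, e), simplified_condition(empty_response, e))
    for e in cases
]
for event, (patched, simplified) in zip(cases, results):
    print(event, patched, simplified)
```

In the first three cases both predicates are true, so the create-only path is taken; with a non-empty account list both are false.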