Missing IAM Role permissions for Distributed Map states
timorthi opened this issue · 11 comments
This is a Bug Report
Description
- What went wrong?
(Preface: In this issue I used an unreleased version of this plugin which incorporates the changes from #536)
When executing a state machine with a Map state in Distributed mode, the execution fails at the Map state because the execution role does not have sufficient permissions to run a Distributed Map.
This is due to the required additional permissions, where the execution role for the state machine containing the Distributed Map must be able to StartExecution on itself: https://docs.aws.amazon.com/step-functions/latest/dg/iam-policies-eg-dist-map.html
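For illustration, the missing statement could look roughly like the sketch below. `buildSelfStartStatement` is a hypothetical helper, not something the plugin exposes, and the ARN is the placeholder one from the error message further down. Only `states:StartExecution` is shown because that is the action the error names; the linked AWS page lists the full set of permissions for Distributed Map.

```javascript
// Hypothetical helper (not part of the plugin): builds the IAM statement that
// lets a state machine's execution role start executions of that same machine.
function buildSelfStartStatement(stateMachineArn) {
  return {
    Effect: 'Allow',
    Action: ['states:StartExecution'],
    Resource: [stateMachineArn],
  };
}

// Placeholder ARN taken from the error message in this issue.
const statement = buildSelfStartStatement(
  'arn:aws:states:us-east-1:12345:stateMachine:MyStateMachineName'
);
console.log(JSON.stringify(statement, null, 2));
```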
- What did you expect should have happened?
The Distributed Map state should have run without permissions issues.
- What was the config you used?
Not really applicable here, just a standard distributed map state:
MyMapState:
  Type: Map
  ItemProcessor:
    ProcessorConfig:
      Mode: DISTRIBUTED
      ExecutionType: STANDARD
    StartAt: FooState
    States:
      FooState:
        ....
- What stacktrace or error message from your provider did you see?
This is the error that appears on the AWS Dashboard during the execution of the state machine:
User: arn:aws:sts::12345:assumed-role/MyStateMachineRole/abcdef is not authorized to perform: states:StartExecution on resource: arn:aws:states:us-east-1:12345:stateMachine:MyStateMachineName because no identity-based policy allows the states:StartExecution action (Service: Sfn, Status Code: 400, Request ID: some-uuid)
Additional Data
- Serverless Framework Core Version you're using: 3.25.1
- The Plugin Version you're using: Unreleased; my package.json points to this commit: 42bd423
- Operating System: N/A
- Stack Trace: N/A
- Provider Error messages: N/A (see description above)
I've been able to brute-force a fix by adding the following rule to the set of IAM Permissions:
timorthi@fd1b36a#diff-5efbd2d24990cd41d11040072f12d980a6ae28e66faef40a046e8d30d69ff528R580-R590
There is probably a better/cleaner way to add the permissions based on the detection of a Distributed Map state, however it appears that getTaskStates ignores "container" states like Maps and Parallels so there's no easy way to detect a Distributed Map without changing getTaskStates.
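One possible way to do that detection without touching getTaskStates is a separate recursive walk over the definition. This is only a sketch: `containsDistributedMap` is a hypothetical name, it is not wired into the plugin, and it assumes the definition has already been parsed into a plain object.

```javascript
// Hypothetical helper: returns true if any Map state, at any nesting depth,
// has ProcessorConfig.Mode set to DISTRIBUTED.
function containsDistributedMap(states) {
  return Object.values(states || {}).some((state) => {
    if (!state || typeof state !== 'object') return false;
    if (state.Type === 'Map') {
      // ItemProcessor is the current field name; Iterator is the older one.
      const processor = state.ItemProcessor || state.Iterator || {};
      const mode = processor.ProcessorConfig && processor.ProcessorConfig.Mode;
      if (mode === 'DISTRIBUTED') return true;
      // Inline Map: keep looking inside its nested states.
      return containsDistributedMap(processor.States);
    }
    if (state.Type === 'Parallel') {
      return (state.Branches || []).some((branch) => containsDistributedMap(branch.States));
    }
    return false;
  });
}

// Example: a Distributed Map nested inside a Parallel branch.
const mapState = {
  Type: 'Map',
  ItemProcessor: {
    ProcessorConfig: { Mode: 'DISTRIBUTED', ExecutionType: 'STANDARD' },
    StartAt: 'FooState',
    States: { FooState: { Type: 'Pass', End: true } },
  },
  End: true,
};
const nestedStates = {
  Fanout: {
    Type: 'Parallel',
    Branches: [{ StartAt: 'MyMapState', States: { MyMapState: mapState } }],
    End: true,
  },
};
// An inline Map (no ProcessorConfig) should not trigger the extra permission.
const inlineStates = {
  InlineMap: {
    Type: 'Map',
    Iterator: { StartAt: 'FooState', States: { FooState: { Type: 'Pass', End: true } } },
    End: true,
  },
};

console.log(containsDistributedMap(nestedStates));
console.log(containsDistributedMap(inlineStates));
```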
@timorthi Please test the IAM roles with the latest commit. Looking forward to your feedback.
@akshaydk Hey there, thanks for the help. I tried out the latest commit on master fcf2b3ca811581b4cdb41fe926d33ce928a8398d but it appears the Resource is incorrect.
I think it's currently fetching the state machine "name" as defined by its key in serverless.yml (stepFunctions.stateMachines.foo) : https://github.com/akshaydk/serverless-step-functions/blob/c7d22ba097a5316558e8601beaf094c071737715/lib/deploy/stepFunctions/compileIamRole.js#L314-L318
However, we also need to take into account the state machine object's name attribute (stepFunctions.stateMachines.foo.name), if it exists, when constructing the ARN for the policy. Here is my "brute force" solution where I fetch that attribute.
That should then accommodate this scenario: https://github.com/serverless-operations/serverless-step-functions#adding-a-custom-name-for-a-statemachine
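Roughly, the idea is the sketch below. `stateMachineArnFor` is a hypothetical helper, not the code from the linked commit. Falling back to the serverless.yml key when no name is set is an assumption that simply mirrors the current behaviour described above, so it may not match the name the state machine is actually deployed with.

```javascript
// Hypothetical helper: prefer the optional `name` attribute when building the
// ARN used as the policy Resource. The fallback to the serverless.yml key is
// an assumption, not confirmed behaviour.
function stateMachineArnFor(key, stateMachineObj) {
  const name = (stateMachineObj && stateMachineObj.name) || key;
  return {
    // Single quotes keep the ${...} placeholders literal for CloudFormation.
    'Fn::Sub':
      'arn:${AWS::Partition}:states:${AWS::Region}:${AWS::AccountId}:stateMachine:' + name,
  };
}

const named = stateMachineArnFor('foo', { name: 'my-custom-name' });
const unnamed = stateMachineArnFor('foo', {});
console.log(named['Fn::Sub']);
console.log(unnamed['Fn::Sub']);
```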
Hey @akshaydk + @timorthi are either of you working on a patch for this?
I've also hit this issue and have some time available to look into either merging timorthi's fix (which I've confirmed works locally) or finding where the original states:StartExecution addition needs to be updated and submitting that for review
@toddhainsworth I am not and that would be awesome, thank you! I think we are better off adjusting @akshaydk's previous PR since that was reviewed and approved already, instead of using my fix.
Makes sense to me, thanks @timorthi
I'll likely get started on this today and have a PR to you both by Thursday this week
I was able to get it working with and without a name much easier than I expected
#552
@timorthi I'm sorry for the late reply. I was a bit busy with my day job the last couple of weeks.
@toddhainsworth Thanks for the PR. Sorry, I couldn't help you much.
This issue has been resolved in version 3.13.0
The release is available on:
Your semantic-release bot
I think this change fixed the execution role for named state machines, but it breaks if the name property is missing from your state machine definition. As a workaround I named my state machine, but that caused it to be recreated, and I lost all my previous execution history.