aws/aws-cdk-rfcs

Deployment Triggers

mneil opened this issue · 38 comments

mneil commented

Description

Allow specifying arbitrary handlers which execute as part of the deployment process and trigger them before/after resources or stacks.

Published: https://github.com/awslabs/cdk-triggers

README

Hypothetical README for this feature

You can trigger the execution of arbitrary AWS Lambda functions before or after resources or groups of resources are provisioned using the Triggers API.

The library includes constructs that represent different triggers. The BeforeCreate and AfterCreate constructs can be used to trigger a handler before/after a set of resources have been created.

new triggers.AfterCreate(this, 'InvokeAfter', {
  resources: [resource1, resource2, stack, ...],
  handler: myLambdaFunction,
});

Similarly, triggers.BeforeCreate can be used to set up a "before" trigger.

Here, resources is a list of construct scopes that determine when the handler is invoked. A scope can be a specific resource or a composite construct (in which case all the resources in the construct are treated as a group). A scope can also be a Stack, in which case the trigger applies to all the resources within the stack (as with any composite construct). All scopes must roll up to the same stack.

Let's look at an example. Say we want to publish a notification to an SNS topic that says "hello, topic!" after the topic is created.

// define a topic
const topic = new sns.Topic(this, 'MyTopic');

// define a lambda function which publishes a message to the topic
const publisher = new NodejsFunction(this, 'PublishToTopic');
publisher.addEnvironment('TOPIC_ARN', topic.topicArn);
publisher.addEnvironment('MESSAGE', 'Hello, topic!');
topic.grantPublish(publisher);

// trigger the lambda function after the topic is created
new triggers.AfterCreate(this, 'SayHello', {
  resources: [topic],
  handler: publisher
});

Requirements

  • One-off exec before/after resource/s are created (Trigger.AfterCreate).
  • Additional periodic execution after deployment (repeatOnSchedule).
  • Async checks (retryWithTimeout)
  • Execute on updates (bind logical ID to hash of CFN properties of triggered resource)
  • Execute shell command inside a Docker image
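The "execute on updates" requirement can be sketched as follows. This relies on one CloudFormation behavior: when a resource's logical ID changes, CloudFormation creates a new resource under the new ID and deletes the old one, so a custom resource receives a fresh CREATE request. The helper names (stableStringify, triggerLogicalId) and the 8-character suffix are assumptions made for illustration, not part of any CDK API.

```typescript
import { createHash } from 'node:crypto';

// Serialize a value with object keys sorted, so that property order
// does not affect the resulting hash.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value);
}

// Hypothetical helper: derive the trigger's logical ID from the properties
// of the resource it watches. A property change yields a different ID.
function triggerLogicalId(baseId: string, watchedProps: Record<string, unknown>): string {
  const digest = createHash('sha256').update(stableStringify(watchedProps)).digest('hex');
  // Logical IDs must be alphanumeric; an upper-case hex suffix satisfies that.
  return `${baseId}${digest.slice(0, 8).toUpperCase()}`;
}
```

With this approach the trigger fires again whenever the watched properties change, and stays untouched when they do not.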

Use Cases

Here are some examples of use cases for triggers:

  • Intrinsic validations: execute a check to verify that a resource or set of resources have been deployed correctly
    • Test connections to external systems (e.g. security tokens are valid)
    • Verify integration between resources is working as expected
    • Execute as one-off and also periodically after deployment
    • Wait for data to start flowing (e.g. wait for a metric) before deployment is successful
  • Data priming: add data to resources after they are created
    • CodeCommit repo + initial commit
    • Database + test data for development
  • Check prerequisites before deployment
    • Account limits
    • Availability of external services
  • Connect to other accounts
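The "wait for data to start flowing" use case amounts to polling a check until it passes or a deadline expires. Below is a minimal sketch of that loop. The name retryWithTimeout is borrowed from the requirements list, but the signature and parameters here are assumptions, and the check itself (e.g. querying a metric) is left to the caller.

```typescript
// Hypothetical helper: poll an async check until it returns true
// or the timeout expires. Resolves to true on success, false on timeout.
async function retryWithTimeout(
  check: () => Promise<boolean>,
  timeoutMs: number,
  intervalMs: number,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await check()) {
      return true;
    }
    // Give up if the next attempt would start after the deadline.
    if (Date.now() + intervalMs > deadline) {
      return false;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A trigger handler could fail the deployment when this resolves to false, which is one way to make the deployment succeed only once data is flowing.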

Implementation

At the base level, the trigger handler can be invoked through a custom resource and the timing (before/after) will be determined using CFN dependencies ("after" means the trigger CR depends on the scope, and "before" is the opposite).
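The direction of the dependency is the whole mechanism, so it may help to see it as plain data. This sketch models the edges only; it does not create CloudFormation resources, and the names Timing, DependencyEdge and triggerDependencies are hypothetical.

```typescript
type Timing = 'before' | 'after';

// A dependency edge: `from` is provisioned only after `to` is complete.
interface DependencyEdge {
  from: string;
  to: string;
}

// Compute the edges between a trigger custom resource and its scopes.
function triggerDependencies(
  timing: Timing,
  triggerId: string,
  scopeIds: string[],
): DependencyEdge[] {
  return scopeIds.map((scopeId) =>
    timing === 'after'
      ? { from: triggerId, to: scopeId } // the trigger waits for the resource
      : { from: scopeId, to: triggerId }, // the resource waits for the trigger
  );
}
```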

This simple implementation will allow us to implement "one-off" triggers. This means that we wait for a CFN CREATE request on the custom resource and invoke the handler. Any updates to the stack will not include any changes to the properties of the custom resource and therefore the trigger won't get invoked again (unless it's removed).
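A rough sketch of the "one-off" behavior from the custom resource provider's point of view. The three RequestType values are the ones CloudFormation sends to custom resources; everything else (the TriggerEvent shape, the HandlerArn property, the injected invoke function) is an assumption made to keep the example self-contained.

```typescript
// The three request types CloudFormation sends to a custom resource.
type RequestType = 'Create' | 'Update' | 'Delete';

interface TriggerEvent {
  RequestType: RequestType;
  // Hypothetical property carrying the ARN of the user's handler.
  ResourceProperties: { HandlerArn: string };
}

// `invoke` is injected so the sketch stays independent of any AWS SDK.
async function onEvent(
  event: TriggerEvent,
  invoke: (handlerArn: string) => Promise<void>,
): Promise<void> {
  if (event.RequestType === 'Create') {
    // One-off: the user's handler runs only when the trigger is created.
    await invoke(event.ResourceProperties.HandlerArn);
  }
  // 'Update' and 'Delete' are acknowledged without invoking the handler.
}
```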

We need to consider the following:

  • If the trigger handler itself changes, do we want it to be invoked again?
  • If the triggering resource is updated, do we want the trigger to be invoked again?
  • Do we want some kind of support for triggers that always get invoked (for any update)?
  • Do we want triggers for "AfterDelete" or "AfterUpdate"? Does that make sense?

Lots to talk about!

Next Steps

  • Least-privilege IAM policy for trigger custom resource provider (currently it's invokeFunction for * resources).
  • Invoke trigger if another resource is added to the stack (even if the trigger has already been created).
  • Consider adding support for "update" triggers (if the triggering resource has been updated).

Related Issues

See #75 for a discussion. Triggers could then be used for, e.g., integration test assertions (#31).

Progress

  • Tracking Issue Created
  • RFC PR Created
  • Core Team Member Assigned
  • Initial Approval / Final Comment Period
  • Ready For Implementation
    • implementation issue 1
  • Resolved

eladb commented

@mneil this is an interesting proposal. I can't say I have a good idea on how to implement something like this in our current architecture, but we might be able to achieve something like this when we generalize through aws/aws-cdk#233.

Do you have other use cases in mind that you can share?

mneil commented

Another thing I can think of is the need to peer cross region vpcs. Currently I believe cfn only works in a single region with the peer resource. If I want to peer two vpcs in different regions I need to launch my stack in two regions with different cidr blocks then use the sdk to run the peering command.

Without events I have to run the cdk code, let it exit, then launch a separate process to do the peering.

I think the solution to each of these cases is going to be Custom Resources (you can run arbitrary code in a Lambda during the CloudFormation workflow). You can write these yourself, today, and they can do what you need. It cannot be anything else because the CDK app executes completely before the CloudFormation deployment starts, and so before the first bucket gets created.

If you want to define them as part of your CDK app, it follows that we're going to have to transparently generate Lambdas for you. It can be done, but there's a good chance it won't work the way you expect it to. For example, any state in your CDK app that your event handler closes over is going to be very hard to transport into the Lambda when it executes at some indeterminate point later in time.

mneil commented

Yes, I'm aware I can use a custom resource.

Right now I'm solving it with code pipeline and code deploy.

I guess I just imagined that if I could use a programming language to compose my stack then I could also use that stack within the context of other code - more like the sdk. As it stands now a cdk application must stand alone and couldn't be triggered within a larger application.

If I wanted to maybe make my cdk app into a micro service I'd need to write a program to accept an incoming connection, then spawn a separate cdk process, wait for it to exit, read the response code, then respond.

Even just adding some lifecycle hooks to the app object would make it far more extensible. I could even attempt a first pass on this at the app level and open a pr. I don't want to spend time on it though if it's not something anyone wants to add.

Okay, I think I see what you're saying. During the course of a deployment we can call the CDK app again, but instead of synthesizing we can invoke arbitrary callbacks, depending on the progress of the deployment.

It's an interesting notion. Definitely not on our current roadmap, but I'd be interested to see a PoC, and more importantly hear of some use cases that were best addressed this way (and not using Custom Resources for example).

Also I don't quite get where the need for a microservice is coming from?

Alternatively, you could implement some constructs right now that have this feature to solve your own use cases, and vend those as an extension library to the CDK.

At this stage (personally for me), it would be sufficient to just have the "hooks" after the stacks are deployed.

Similar to aws/aws-cdk#1938 but on the client-side.

The use-cases:

  • use AWS API to amend the resources that CloudFormation doesn't support (#1938 is probably better for that, but local update often times is enough)
  • pushing notifications, outside of the account where the deployment happens
  • confirming a success/failure and the details for the automation scripts

Possible API (not well thought through) could look like:

// App-level - constructor parameters
const app = new cdk.App({ hooks: { deployed: (s: Stack, outcome: cdk.OutcomeDetails) => {...} } });
// ...
app.run()



// App-level - promise-like
const app = new cdk.App();
// ...
app.run().whenDeployed().then((stack, outcome) => { ... });



// Stack override
export class Stack1 extends cdk.Stack {
    constructor(scope: cdk.App) { ... }

    whenDeployed(outcome: cdk.OutcomeDetails) {
      // default implementation - noop
    }
}

const app = new cdk.App();
const stack1 = new Stack1(app);

Out of the above, I think my current preference is Stack override as it would make sense in my use-case - amend Cognito client OAuth details. This option keeps the stack-related details closer together.

But the application-level options are fine too and probably are better for other use-cases.

I would also like to see this supported. My use case is that I need to generate some k8s manifests after creation/update

eladb commented

We will likely not have time to look at this in the coming months. In the meantime, @itajaja can you implement your k8s manifest creation as a custom resource? Sounds like it might be a better fit.

well, Ideally, i'd like them to be saved as files to disk

eladb commented

You can emit them during synthesis and then treat them as assets, which will automatically be uploaded to S3 for you (like Lambda code bundles). Then, reference them from a custom resource and configure your k8s cluster.

I guess I am missing some terminology. what's synthesis? what's an "asset"? is an asset an "output"?

eladb commented

"synthesis" basically means that you can create this file as part of the execution of your CDK app (which is called by cdk synth). Assets are local files (or Docker images) that are uploaded to S3/ECR as part of "cdk deploy" and their location is made available to your CDK app. Here is the README file for the assets library.

thanks for the rundown, that's a lot of useful information!

just fyi, using assets, or doing it during synthesis, might not be enough, because I need output ARNs to include in my manifests, that's why it needs to be done on completion

I'm building an ETL pipeline that ingests survey data from Typeform. Part of the stack is an API Gateway endpoint that implements a Typeform webhook (where Typeform POSTs survey submissions). After successful deployment of the stack, I'd like to programmatically set the URL to my newly-deployed API endpoint. It seems like I need some post-deployment hook where I have access to the generated URL.

My alternative is to wrap cdk deploy in a script that looks up the URL using the AWS SDK, then calls the Typeform developer APIs to setup the webhook with that URL.
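For reference, a wrapper script of this kind does not necessarily need an SDK lookup: `cdk deploy --outputs-file <path>` writes the stack outputs to a JSON file keyed by stack name and then output name. The sketch below reads such a file. The helper names, and the stack and output names used with them, are hypothetical, and the call to the Typeform API is left out.

```typescript
import { readFileSync } from 'node:fs';

// Shape of the file written by `cdk deploy --outputs-file <path>`:
// stack name -> output name -> value.
type StackOutputs = Record<string, Record<string, string>>;

// Load and parse the outputs file.
function loadOutputs(path: string): StackOutputs {
  return JSON.parse(readFileSync(path, 'utf8')) as StackOutputs;
}

// Look up a single output, failing loudly if it is missing.
function getOutput(outputs: StackOutputs, stackName: string, outputName: string): string {
  const value = outputs[stackName]?.[outputName];
  if (value === undefined) {
    throw new Error(`Output ${outputName} not found in stack ${stackName}`);
  }
  return value;
}
```

This only works if the stack declares the endpoint URL as an output; after deployment the script would call something like getOutput(loadOutputs('outputs.json'), 'EtlStack', 'WebhookUrl').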

I'm interested in any progress or thoughts on this feature request.

eladb commented

This should be quite trivial to implement with a custom resource, and will also allow you to react to updates/deletes in the URL. It does seem like a common pattern that we can probably generalize (basically offer a construct backed by a custom resource that will issue http requests for create/update/delete). Sounds like something @jogold would enjoy working on :-)

basically offer a construct backed by a custom resource that will issue http requests for create/update/delete

This should already be possible with an AwsCustomResource issuing publishMessage calls to an SNS topic. You can then have your HTTP endpoint subscribed to this topic.

Any updates on this?

Another usecase for me:

i can create a codecommit repo in cdk, what i want to be able to do then, is that after that repo is created (deployed), trigger LOCAL steps to get the URL, add it to the local git then push. OR create a git submodule with the URL, then add/commit/push on that. currently i have to do that manually and it takes a couple of steps (including having to login to the UI or use the cli to get the URL), not ideal when im sure this can be fully automated :)

Can we actually change the title back to "Constructs should emit events"? There is a new section of the CloudFormation template called "Hooks": https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/blue-green.html#blue-green-template-reference , used in the blue-green deployment for ECS.

We will need to support this new section of the template at some point, and I think it will be very confusing to have "Construct events" also be called "Hooks".

eladb commented

@skinny85 renamed to "Triggers". Hope that's better

It is, thanks :).

eladb commented

From aws/aws-cdk#11344:

I'm interested in some sort of hooks framework added to the CDK so that one can declare logic to execute after / before a cdk command. This would leave the user experience tied to the CDK user experience, instead of wrapping it in shell scripts etc.

Use Case

  1. Authentication workflows, especially until SSO works correctly.
  2. Creating .env files for local testing to know which resources to use after cdk deploy (e.g. a lambda or dynamo table to invoke)
  3. Emptying s3 buckets before cdk destroy

Proposed Solution

I don't have one, but ansible does this and it's really useful https://github.com/openshift/openshift-ansible/blob/master/HOOKS.md

Just ran into a situation where i need to run some integration tests in my ci pipeline, some event emitter/hooks would be awesome to bundle that within cdk.

eladb commented

Thanks for everyone who attended CDK Construction Zone. We started building this in the first episode. Code is here: https://github.com/eladb/cdk-triggers

We will continue the implementation in the next episode of "CDK Construction Zone", happening on Feb 23rd 9AM PDT. Check out the AWS twitch channel schedule for more details.

Recording of the first episode is now available on the AWS Twitch channel: https://www.twitch.tv/videos/917691798

awesome sauce :D looking forward to using this!

Question @eladb, will this only be for running lambdas or will we be able to run local scripts with the triggers too? e.g. deploy codecommit, trigger when its deployed successfully to grab the repo URL and add it as the remote to the local cdk project git.

eladb commented

@binarythinktank asked:

will this only be for running lambdas or will we be able to run local scripts with the triggers too? e.g. deploy codecommit, trigger when its deployed successfully to grab the repo URL and add it as the remote to the local cdk project git.

At the moment, this is focusing on deploy-time actions, but I'd like to hear more about your use case. It sounds like you are looking for a way to "bootstrap" CDK projects, right? Can you provide some more context?

A few use cases. If we can access the stack vars after deployment such as names, arns and parameters of the deployed services that are not known pre-deployment, and we can trigger events after a service and/or stack deployment, and then run code locally with that information, it opens a whole bunch of automation opportunities.

  • configure local git as mentioned, even do a first commit & push
  • deploy templates, config files, bootstraps, etc into the local directory from a shared location
  • useful feedback to the Dev such as API and ui URLs of what was deployed
  • automatically commit/push after a successful deployment (potentially to trigger any codepipelines for deploying ui to s3 or similar)

I'm sure I can think of more things that would be useful to me/my clients but it's getting late here :) perhaps others could chime in?

Another good example would be setting up a client or site-to-site VPN in the VPC and then from the deployment machine, configuring the VPN software to connect to that VPC.

I'm currently working on a solution for synthing a Service Catalog Product template from CDK, pushing that and its assets to S3, and then deploying a second CDK app with the SC Product definition. Right now I'm doing it with an external Python CLI script, but having triggers or hooks like this in place would make it so I didn't have to have a separate tool.

BTW, the CDK Construction Zone is a fantastic idea. I can't wait to watch the recording, and hope you do more in the future!

Additional usecase from what I'm currently working on.

Pre-destroy hook: If I have a stack with an ECS cluster, running 'cdk destroy' will show CF errors because the cluster has active nodes (not drained). I'd like to have a way to write some code to drain these nodes before starting the destroy (I know I can do this outside of CDK, but it would be nice to be self contained)

CDK Triggers have been released: https://github.com/awslabs/cdk-triggers

https://stackoverflow.com/questions/65773331/how-to-enforce-standards-and-controls-when-using-cdk-pipeline documents a use-case that I am currently struggling with. Here's a quick summary -

We are adopting CDK and CDK Pipeline in our firm. We want to let developers customize the CDK Pipelines to their heart's content. But we also want to make sure that they at least follow some pre-defined standards, e.g. the pipeline should always contain a Manual Approval Action before the stage that deploys to a cross-account prod environment (the dev role doesn't have the permissions to approve/reject this action).
I do not have much idea on how to enforce this without building a custom wrapper library and forcing developers to use it instead of vanilla CDK libraries.

One option that comes to mind is to have a lambda trigger before the actual CFN template deployment (or preferably, after the assets get published) which validates that the synthesized CFN template adheres to the standards. This, again, would need to be configured at the time of bootstrapping, because that's the only stage where the resources and configuration are universally applied to each and every CDK app and we are not reliant on developers to include these validations when defining their pipeline.
But again, I am not exactly sure how to implement this. The title of this issue brought me here. I thought that if we could configure some triggers at the time of bootstrapping itself, which would always run the validation logic on the final CFN template before it gets deployed, the problem would be solved.
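Independent of where such a check would run, the validation logic itself can be fairly small. The sketch below inspects a synthesized template (as a parsed JSON object) and reports pipelines that contain no approval action at all. It is a simplification: it does not verify that the approval comes before the prod deployment stage. The function name and the minimal interfaces are assumptions; the resource type and the ActionTypeId.Category field follow the AWS::CodePipeline::Pipeline resource schema.

```typescript
interface PipelineAction {
  ActionTypeId?: { Category?: string };
}
interface PipelineStage {
  Actions?: PipelineAction[];
}
interface TemplateResource {
  Type: string;
  Properties?: { Stages?: PipelineStage[] };
}
interface Template {
  Resources?: Record<string, TemplateResource>;
}

// Return the logical IDs of pipelines that have no approval action in any stage.
function pipelinesMissingApproval(template: Template): string[] {
  const missing: string[] = [];
  for (const [logicalId, resource] of Object.entries(template.Resources ?? {})) {
    if (resource.Type !== 'AWS::CodePipeline::Pipeline') {
      continue;
    }
    const stages = resource.Properties?.Stages ?? [];
    const hasApproval = stages.some((stage) =>
      (stage.Actions ?? []).some((action) => action.ActionTypeId?.Category === 'Approval'),
    );
    if (!hasApproval) {
      missing.push(logicalId);
    }
  }
  return missing;
}
```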

Would really appreciate if others can share their thoughts on this.

eladb commented

This capability is now available as part of the AWS CDK: https://github.com/aws/aws-cdk/tree/master/packages/%40aws-cdk/triggers