The AWS CloudFormation documentation includes a list of all the possible stack states:
CREATE_COMPLETE
CREATE_IN_PROGRESS
CREATE_FAILED
DELETE_COMPLETE
DELETE_FAILED
DELETE_IN_PROGRESS
REVIEW_IN_PROGRESS
ROLLBACK_COMPLETE
ROLLBACK_FAILED
ROLLBACK_IN_PROGRESS
UPDATE_COMPLETE
UPDATE_COMPLETE_CLEANUP_IN_PROGRESS
UPDATE_IN_PROGRESS
UPDATE_ROLLBACK_COMPLETE
UPDATE_ROLLBACK_COMPLETE_CLEANUP_IN_PROGRESS
UPDATE_ROLLBACK_FAILED
UPDATE_ROLLBACK_IN_PROGRESS
However, it does not make it clear how the states relate to each other.
This document aims to rectify that.
Caveat lector: This document reflects my understanding of AWS CloudFormation. It is not definitive. I do not work for AWS. I've been using CloudFormation for a few years now, but there are features of it that I've not yet used. For all of those reasons, this document may not be 100% accurate.
But I hope that it's helpful.
With that said, let's get going.
At a high level, the lifecycle of a stack is:
- the stack initially does not exist;
- the stack is created;
- the stack is updated (0 or more times);
- the stack is deleted;
- the stack finally does not exist.
which you can visualise as follows:
However, "CREATED", "UPDATED" and "DELETED" are not real stack states – just a simplified model. So let's talk specifics.
If we ignore all the error cases, the state diagram is as follows:
- the stack initially does not exist;
- the stack is created:
- by first passing through
CREATE_IN_PROGRESS
- before coming to rest at
CREATE_COMPLETE
;
- by first passing through
- zero or more times, the stack is updated, where each update consists of:
- first passing through
UPDATE_IN_PROGRESS
- then passing through
UPDATE_COMPLETE_CLEANUP_IN_PROGRESS
- before coming to rest at
UPDATE_COMPLETE
;
- first passing through
- the stack is deleted:
- by first passing through
DELETE_IN_PROGRESS
- before coming to rest at
DELETE_COMPLETE
;
- by first passing through
- the stack finally does not exist.
A couple of things to note here:
Firstly, each stack state can be considered either to be "in motion" (all the stacks named ..._IN_PROGRESS
), or "at rest" (everything else).
Secondly, note that once a stack has been deleted (DELETE_COMPLETE
), the
stack remains present in AWS CloudFormation for the next 90 days, so that the
stack's metadata and events (log) can be inspected. After 90 days, the stack
disappears from CloudFormation.
There's also the REVIEW_IN_PROGRESS
state, which (when used) sits right near the top,
just after "the stack initially does not exist", but before
CREATE_IN_PROGRESS
. REVIEW_IN_PROGRESS
is used if the stack is created
via a change set.
But: what about the error cases?
Let's complete the diagram by adding in all the states related to error handling:
To make the diagram simpler, I've added a few "pseudo-states", to represent abstract concepts like "a stack is updateable". I've also assumed that some of the states that can be passed through are (in a manner of speaking) always passed through, even if the stack might spend zero time in that state.
To break this down, let's look at how stacks are created, updated, and deleted.
Part of CloudFormation's job is to work out "how to get there from here" – what resource creations, updates and deletes need to happen, and in what order, to reach the requested outcome.
The CreateStack, UpdateStack and DeleteStack API calls all make use of this approach: validate the request, work out what changes will be required, then make those changes. The Change Sets API calls can be used to inspect the planned changes before they are performed.
The changes always break down into:
- 0 or more resource creates / updates;
- followed by 0 or more resource deletions.
Some "updates" can be performed by updating the existing resources in-place; some work by doing a create of a new resource, then deleting the old one.
If anything goes wrong, then CloudFormation may attempt to roll back to the previous steady state. This is done essentially by replaying the steps done so far, but in reverse order, and with the sense of all the changes reversed (create instead of delete, and vice versa).
For stack creation, the planned changes will always consist of resource creations only (no updates, no deletes):
If anything goes wrong, then any resources that were created need to be
deleted (the ROLLBACK_IN_PROGRESS
/ ROLLBACK_FAILED
/ ROLLBACK_COMPLETE
states).
UpdateStack is the most complex case, because it may include resource creates, updates, and deletes. And if anything goes wrong, this means that the rollback too might involve deletes, updates, and creates:
If a stack reaches the UPDATE_ROLLBACK_FAILED
state, the options are to
attempt deletion (DELETE_IN_PROGRESS
), re-attempt rollback
(UPDATE_ROLLBACK_IN_PROGRESS
), or to contact AWS support. See
the AWS blog post on "continue update rollback"
for more information.
DeleteStack is reasonably straightforward; for each resource that is not already deleted, and for which the DeletionPolicy says to delete the resource, the resource deletion will be attempted. In case of problems, rollback (i.e. recreating resources) is not performed.
If there are problems, the stack goes to the DELETE_FAILED
state. Once the
reason for the deletion failure has been identified and fixed, DeleteStack
can be called again to re-try the deletion of any resources that have not yet
been deleted.