AndrewFarley/AWS-Automated-Daily-Instance-AMI-Snapshots

Tagging volume snapshots based on their previous tags...

AndrewFarley opened this issue · 6 comments

Did some brief investigation into this... it would take some effort, and would need to...

  • Sleep after creating AMI(s) and wait for all snapshots to appear
  • Because snapshots are not directly associated with an AMI, only with a volume (the snapshot description mentions the AMI), scan through the snapshots to find the right ones
  • Could then tag every snapshot with whatever tags that volume had
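The scan-and-tag steps above can be sketched roughly as follows. This is a minimal sketch, not the project's implementation: the function name is hypothetical, `ec2` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2")`), and it assumes each snapshot still reports a `VolumeId` for a volume that exists. Pagination and error handling are left out.

```python
def copy_volume_tags_to_ami_snapshots(ec2, ami_id):
    """Tag each snapshot whose description mentions ami_id with its volume's tags.

    `ec2` is assumed to be a boto3 EC2 client. Returns {snapshot_id: [tags applied]}.
    """
    # Snapshots are not linked to the AMI directly, so match on the description.
    response = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "description", "Values": [f"*{ami_id}*"]}],
    )
    applied = {}
    for snapshot in response["Snapshots"]:
        volume_id = snapshot.get("VolumeId")
        if not volume_id:
            continue
        # Assumption: the volume still exists; a deleted volume would raise here.
        volumes = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"]
        if not volumes:
            continue
        # Tag keys prefixed with "aws:" are reserved and cannot be set by users.
        tags = [
            tag for tag in volumes[0].get("Tags", [])
            if not tag["Key"].startswith("aws:")
        ]
        if tags:
            ec2.create_tags(Resources=[snapshot["SnapshotId"]], Tags=tags)
            applied[snapshot["SnapshotId"]] = tags
    return applied
```

The hard part is not the tagging itself but knowing when the snapshots exist, which is what the rest of this thread is about.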

If only a few AMIs are created this is possible, but for a larger infrastructure/deployment this will fail. When you schedule more than a few AMIs, they don't all run at once; some get scheduled to run a little later, which would push us over the 5-minute Lambda timeout if we had to wait that long.

Some ideas to achieve this: on a "second" run, try to tag the older snapshots while it's creating new AMIs and snapshots. Or create a second Lambda which does this automatically, but this feels like over-complicating something I want to keep fundamentally simple. Input/ideas welcome. Or I just make this an optional feature, leave it disabled by default, and let people try it out. :P

> If only a few AMIs are created this is possible, but for a larger infrastructure/deployment this will fail, because when you schedule more than a few AMIs, they don't all run at once; they can be scheduled to run a little later, which would push us over 5 minutes in Lambda if we had to wait that long.

@AndrewFarley Are you sure about this? Looking at my 19 AMIs, their creation times are all within 23 seconds of the Lambda function's start time.
~Samet

@charterresources I've seen this happen in the past numerous times (been working on AWS for way too long). So, under good conditions this never happens. I'm okay with implementing it, but it won't be enabled by default because I have no way to fail gracefully if I go over the 5 minute timeout of Lambda unless I implement state management and a DLQ and a retry mechanism based on the DLQ. But, I still plan to implement it because those snapshots without tags are really annoying. It's just a fundamental flaw in the way AWS does AMI snapshots (without tags) that I have to try to compensate for.
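One lighter-weight way to avoid running into the timeout, short of state management and a DLQ, might be to poll with a deadline and give up cleanly before Lambda kills the function. The sketch below is only an illustration under assumptions: `wait_for_amis`, `margin_ms`, and `poll_seconds` are hypothetical names, `ec2` is assumed to be a boto3 EC2 client, and `context` is the standard Lambda context object, whose `get_remaining_time_in_millis()` method reports the time left.

```python
import time


def wait_for_amis(ec2, ami_ids, context, margin_ms=30_000, poll_seconds=10,
                  sleep=time.sleep):
    """Poll until all AMIs are 'available' or the Lambda is close to timing out.

    Returns (ready, pending) as two sorted lists of AMI IDs.
    """
    pending = set(ami_ids)
    ready = set()
    while pending:
        # Using the image-id filter instead of ImageIds: an AMI the API does not
        # know about yet should simply be missing from the result, not an error.
        response = ec2.describe_images(
            Filters=[{"Name": "image-id", "Values": sorted(pending)}]
        )
        for image in response["Images"]:
            if image.get("State") == "available":
                ready.add(image["ImageId"])
        pending -= ready
        if not pending:
            break
        # Stop while there is still enough time left to exit cleanly.
        if context.get_remaining_time_in_millis() < margin_ms + poll_seconds * 1000:
            break
        sleep(poll_seconds)
    return sorted(ready), sorted(pending)
```

Whatever is still in `pending` when this returns would stay untagged until a later run picks it up, so this only softens the failure; it does not remove the need for some kind of retry.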

@charterresources If you want to test what I'm talking about... try running your Lambda 5 times in a row very quickly, so you have about 100 AMIs queued up, and then go look at the snapshot creation times, not the AMI creation times. In some brief investigation earlier this week I saw the problem on one of my environments... the create AMI request went through, but then for about 10 minutes I could not query that AMI; the API responded with "unknown AMI". This is what I'm talking about, unfortunately. They have an "eventually consistent" model we have to work around. And even the AMI creation times... those don't reflect when the AMIs are available via the API. Try making requests against those AMIs immediately after creation; they will often fail.

Actually, I just thought of something: I can probably listen for the CloudWatch event of an AMI being completed... and on that event trigger my Lambda to go tag the snapshots... :) Eh? That's a really good idea. I'll do some testing on that this weekend and if it isn't too hard I'll just implement it. Now that I think about that pattern, it shouldn't be too hard. If that event is thrown, we should be good!
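A rough sketch of that event-driven pattern, assuming the account receives `EC2 AMI State Change` events through CloudWatch Events/EventBridge with `ImageId` and `State` fields in `detail`. The event shape here is an assumption to verify against the AWS documentation, and `ami_id_from_event`, `handle_ami_event`, and the `tag_snapshots` callback are hypothetical names.

```python
# Event pattern for the rule (assumed shape; check it against the AWS docs).
AMI_AVAILABLE_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 AMI State Change"],
    "detail": {"State": ["available"]},
}


def ami_id_from_event(event):
    """Return the AMI ID if the event reports an AMI becoming available, else None."""
    if event.get("detail-type") != "EC2 AMI State Change":
        return None
    detail = event.get("detail", {})
    if detail.get("State") != "available":
        return None
    return detail.get("ImageId")


def handle_ami_event(event, tag_snapshots):
    """Call tag_snapshots(ami_id) for a completed AMI; ignore every other event.

    `tag_snapshots` is a hypothetical callback that does the actual tagging.
    """
    ami_id = ami_id_from_event(event)
    if ami_id is None:
        return None
    tag_snapshots(ami_id)
    return ami_id
```

Because each invocation handles a single finished AMI, nothing has to sleep or wait, which sidesteps the timeout concern entirely.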

@AndrewFarley
Great!!
thank you Andrew...

Hey @AndrewFarley, any chance of this being implemented? I've taken to running a separate script that copies the tags to the snapshots an hour or so after (we tag all resources with costcentres to bill the appropriate departments, so rather vital for our billing process).

Very keen on seeing a CloudWatch event trigger the tagging instead! We have a function using a similar method: when it sees a snapshot complete in CloudWatch, a Lambda function is triggered to automagically create an AMI from it.
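For illustration, reacting to a snapshot-completed event could start with something like the sketch below. It assumes the `EBS Snapshot Notification` event shape, where `detail.snapshot_id` and `detail.source` are ARNs ending in the snapshot and volume IDs; that shape, and whether snapshots created as part of an AMI emit this event at all, should be verified. The function name is hypothetical.

```python
def snapshot_and_volume_from_event(event):
    """Return (snapshot_id, volume_id) for a successful createSnapshot event, else None."""
    if event.get("detail-type") != "EBS Snapshot Notification":
        return None
    detail = event.get("detail", {})
    if detail.get("event") != "createSnapshot" or detail.get("result") != "succeeded":
        return None
    # Both fields are assumed to be ARNs such as
    # "arn:aws:ec2::us-west-2:snapshot/snap-01234567"; keep only the trailing ID.
    snapshot_id = detail.get("snapshot_id", "").rsplit("/", 1)[-1]
    volume_id = detail.get("source", "").rsplit("/", 1)[-1]
    if not snapshot_id.startswith("snap-") or not volume_id.startswith("vol-"):
        return None
    return snapshot_id, volume_id
```

With both IDs in hand, a handler could read the volume's tags and apply them to the snapshot, with no waiting or scanning involved.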