pbudzon/aws-maintenance

Retention Policy for the Snapshots

Closed this issue · 2 comments

@pbudzon: The cross-region backup for Aurora instances removes snapshots that are more than one day old. What if we want to implement a retention policy such as keeping all snapshots from the last 30 days, plus one snapshot from the 1st of each of the previous six months? I can keep the backups for 30 days by changing snapshots_to_remove = [i[0] for i in sorted_snapshots[29:]]. But to keep the snapshot from the first day of each month (for the last six months), do I need to change the name of the snapshot? Could you please suggest an approach?

Hi @Prashast07
Number of snapshots to be retained could be a config value actually, but it's a simple change like you mentioned - the value in sorted_snapshots[x:].
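
As a rough sketch of what that config value could look like: the name SNAPSHOTS_TO_KEEP and the helper select_snapshots_to_remove are hypothetical and not part of the script, and the code assumes sorted_snapshots is a list of (identifier, metadata) pairs ordered newest first, as the slice in the question implies. Note that slicing at N keeps the newest N snapshots.

```python
import os

# Hypothetical setting; the name SNAPSHOTS_TO_KEEP is an assumption, not part of the script.
SNAPSHOTS_TO_KEEP = int(os.environ.get("SNAPSHOTS_TO_KEEP", "30"))


def select_snapshots_to_remove(sorted_snapshots, keep=SNAPSHOTS_TO_KEEP):
    """Return the identifiers of every snapshot beyond the newest `keep`.

    Assumes sorted_snapshots is a list of (identifier, metadata) pairs,
    ordered newest first.
    """
    return [i[0] for i in sorted_snapshots[keep:]]
```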

For more complex retention scenarios you could tag each snapshot with a deletion date - tags can be added to the snapshot when it's being copied (in both copy_db_cluster_snapshot and copy_db_snapshot). You can then implement any logic you want for determining the deletion date - in your example, backups made on the 1st of each month would get deletion date today + 6 months, all others would get deletion date today + 30 days.
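
A minimal sketch of that date logic, assuming the tag key Deletion_Date and an ISO 'YYYY-MM-DD' value (both are illustrative choices, as are the helper names deletion_date_for and deletion_tag). The returned dict is in the shape the Tags parameter of the copy calls expects.

```python
from datetime import date, timedelta


def deletion_date_for(snapshot_date: date) -> date:
    """Return the date on which a snapshot taken on snapshot_date may be deleted."""
    if snapshot_date.day == 1:
        # Monthly snapshot: keep for six calendar months.
        # Day 1 exists in every month, so plain month arithmetic is safe here.
        month_index = snapshot_date.month - 1 + 6
        return date(snapshot_date.year + month_index // 12, month_index % 12 + 1, 1)
    # Every other snapshot: keep for 30 days.
    return snapshot_date + timedelta(days=30)


def deletion_tag(snapshot_date: date) -> dict:
    """Build a tag entry suitable for the Tags list of the copy call."""
    return {"Key": "Deletion_Date", "Value": deletion_date_for(snapshot_date).isoformat()}
```

For example, deletion_tag(date.today()) could be passed as Tags=[deletion_tag(date.today())] when copying.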

You could then use the tag instead of the creation date (which is what's used right now) to determine whether to delete the snapshot. I imagine the get_snapshots_list function would have to be modified to include the tag value (tags don't seem to be returned by the describe functions for snapshots, but list_tags_for_resource should get them), and then the logic around the snapshots_to_remove list would need to be extended to compare the tag date with the current date when listing snapshots for deletion.
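
Something along these lines might work for the deletion side. It is only a sketch: the function name expired_snapshot_ids and the tag key Deletion_Date are assumptions, the client is passed in so any object exposing list_tags_for_resource (such as a boto3 RDS client) can be used, and snapshots is assumed to be the DBClusterSnapshots list from the describe call.

```python
from datetime import date


def expired_snapshot_ids(client, snapshots, today, tag_key="Deletion_Date"):
    """Return identifiers of cluster snapshots whose deletion-date tag is due.

    `client` is assumed to expose list_tags_for_resource (e.g. a boto3 RDS client).
    `snapshots` is assumed to be a list of dicts with DBClusterSnapshotArn and
    DBClusterSnapshotIdentifier keys.
    """
    expired = []
    for snap in snapshots:
        response = client.list_tags_for_resource(
            ResourceName=snap["DBClusterSnapshotArn"]
        )
        # Index tags by key so the lookup does not depend on tag order.
        tags = {t["Key"]: t["Value"] for t in response.get("TagList", [])}
        value = tags.get(tag_key)
        if value is None:
            continue  # snapshots without the tag are left alone
        # Tag value is assumed to be an ISO 'YYYY-MM-DD' date.
        if date.fromisoformat(value) <= today:
            expired.append(snap["DBClusterSnapshotIdentifier"])
    return expired
```

Comparing with <= rather than == means a snapshot still gets picked up if the job did not run on the exact deletion date.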

Would be great to see a PR with this change! :)

@pbudzon I have been able to achieve the desired result using the following:

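import datetime
from datetime import datetime as dt

# Find the source snapshot created today (current_date is assumed to be 'YYYY-MM-DD').
for i in data_source['DBClusterSnapshots']:
    # str() handles both datetime objects and strings; the first 10 characters are the date.
    snapshot_creation_time_source = str(i['SnapshotCreateTime'])[:10]

    if snapshot_creation_time_source == current_date:
        source_cluster_snapshot_arn = i['DBClusterSnapshotArn']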


# Snapshots copied on the 1st of the month are kept for about six months
# (approximated here as 180 days); all others are kept for 30 days.
retention_days = 180 if dt.now().strftime('%d') == '01' else 30

response = TARGET_CLIENT.copy_db_cluster_snapshot(
    SourceDBClusterSnapshotIdentifier=source_cluster_snapshot_arn,
    TargetDBClusterSnapshotIdentifier=target_cluster_snapshot_arn,
    KmsKeyId='arn:aws:kms:us-west-2:xxxxxxxxxxx:key/abc',
    CopyTags=True,
    Tags=[
        {
            'Key': 'Deletion_Date',
            'Value': (dt.now() + datetime.timedelta(days=retention_days)).strftime('%Y-%m-%d')
        },
    ],
    SourceRegion=SOURCE_REGION
)

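for i in data_target['DBClusterSnapshots']:
    oregon_cluster_snapshot_arn = i['DBClusterSnapshotArn']
    oregon_cluster_snapshot_identifier = i['DBClusterSnapshotIdentifier']
    count = count + 1
    response = TARGET_CLIENT.list_tags_for_resource(
        ResourceName=oregon_cluster_snapshot_arn,
    )
    # Look the tag up by key; with CopyTags=True it may not be the first entry in TagList.
    tags = {t['Key']: t['Value'] for t in response['TagList']}
    deletion_date = tags.get('Deletion_Date')

    # 'YYYY-MM-DD' strings sort chronologically; <= also catches snapshots whose date has already passed.
    if deletion_date is not None and deletion_date <= current_date:
        response = TARGET_CLIENT.delete_db_cluster_snapshot(
            DBClusterSnapshotIdentifier=oregon_cluster_snapshot_identifier
        )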