gruntwork-io/cloud-nuke

Timing out nuke through cli

Closed this issue · 8 comments

On few occasions, we have seen nuke is not able to clean things due to either using the new version & config not matching (guard duty cleanup) or using old version where s3 cleanup failed which were not empty, etc

But the main issue it keeps on trying to clean it up, and never gives up.

Is there any way we can set the retries? That after "x" retries if it didn't clean it should proceed further & in the end error out saying what was left behind, rather than getting stuck at one resource & keep trying to delete it? thank you.

anyone can respond, please?

Hi @denmanveer,
Only a few resources in cloud-nuke have retries configured, they all have max retry counts that are not configurable from the cli. For S3 specifically, when cloud-nuke encounters a non-empty bucket it will attempt to delete ALL objects within the bucket. Depending on the number of objects this could make it appear as if cloud-nuke is stuck. S3 does not provide an api for retrieving the object count without iterating through all objects so for efficiency we do not provide a progress indicator in this case.

thank you @ellisonc is there a way we can (MF paid customer) request for that feature to be implemented?
So, that we can pass timeouts to the CLI command itself or through the config file where we include & exclude resources, we can control the retries too there? thank you.

I don't think this is a problem with retries, but a timeout feature might be possible in the future. I'll discuss with our product manager. Until then I would suggest using a tool like timeout to wrap cloud nuke or set a timeout in whatever automation tooling is used to invoke cloud-nuke.

hey @ellisonc sorry I meant the timeout only (I started with timeout only, don't know why I mentioned retries :D)

that will be great, thank you, will keep following this thread.

Hey @denmanveer, were you expecting to set timeout per resource or the overall timeout on the command you run?

Hey @denmanveer, were you expecting to set timeout per resource or the overall timeout on the command you run?

hey @hongil0316 , timeout on per resource, like it tries to clean the s3 bucket, but if the bucket is not empty, it will keep re-trying to clean the s3 bucket.
Rather if we can add some re-tries, so lets say it re-tries for 5 times, then it moves to the next step e.g. cleaning ec2, etc and goes on.
and in end, the summary shows only s3 failed, rest all proceeded.
something like that.

Got it. Looked into the code but it would require quite significant amount of changes to implement something like this: https://golangbot.com/context-timeout-cancellation/.

Potentially passing around context and stopping the nuking operation when it times out. Can look into it later when I get a chance! Otherwise, feel free to propose a PR for it. I would love to review as well