Adding a start/stop mode to terminate instances as required without waiting for the 60-minute TTL to expire.
Closed this issue · 3 comments
There should be a start/stop mode for starting the instance when required and stopping it when its job is done. The TTL is good, but it isn't useful if my job is done in 5 minutes: why would I pay for the remaining 55 minutes when the minimum charge per instance is 1 minute? Please look into this ASAP; then this action will be unbeatable.
Instances are supposed to terminate about 60 seconds after job completion (failure or success). The 60-second grace period gives the code a chance to call the GitHub API and perform any necessary cleanup.
Are you saying instances stay around after a job has completed?
`echo "shutdown -P +1" > $CURRENT_PATH/shutdown_script.sh`,
"chmod +x $CURRENT_PATH/shutdown_script.sh",
`echo "./config.sh remove --token ${runnerRegistrationToken.token} || true" > $CURRENT_PATH/shutdown_now_script.sh`,
`echo "shutdown -h now" >> $CURRENT_PATH/shutdown_now_script.sh`,
"chmod +x $CURRENT_PATH/shutdown_now_script.sh",
"export ACTIONS_RUNNER_HOOK_JOB_COMPLETED=$CURRENT_PATH/shutdown_script.sh",
The code above is part of our startup cloud-init script.
- It creates a shutdown script and then uses `ACTIONS_RUNNER_HOOK_JOB_COMPLETED` to make sure it is executed once a job finishes.
- We also have `github_job_start_ttl_seconds`, which defines how long an instance is allowed to stay idle before a job is executed.
- Finally, we have the instance TTL, which takes effect if the two options above both fail for any reason.
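For readers who want to see how the pieces above fit together, here is a minimal sketch of a function that assembles those cloud-init commands. The function name `buildShutdownCommands` and its `registrationToken` parameter are hypothetical and not taken from the action's source. The sketch also assumes the second `echo` for `shutdown_now_script.sh` is meant to append (`>>`) so that the runner deregistration line is kept.

```typescript
// Hypothetical helper: the name and parameter are illustrative only,
// not part of the action's actual code.
function buildShutdownCommands(registrationToken: string): string[] {
  return [
    // Delayed shutdown: power off one minute after the hook fires,
    // leaving time for cleanup calls to the GitHub API.
    'echo "shutdown -P +1" > $CURRENT_PATH/shutdown_script.sh',
    'chmod +x $CURRENT_PATH/shutdown_script.sh',
    // Immediate shutdown: deregister the runner, then halt.
    // Assumption: the second echo appends (>>) so the first line survives.
    `echo "./config.sh remove --token ${registrationToken} || true" > $CURRENT_PATH/shutdown_now_script.sh`,
    'echo "shutdown -h now" >> $CURRENT_PATH/shutdown_now_script.sh',
    'chmod +x $CURRENT_PATH/shutdown_now_script.sh',
    // The runner executes this script when a job completes (success or failure).
    'export ACTIONS_RUNNER_HOOK_JOB_COMPLETED=$CURRENT_PATH/shutdown_script.sh',
  ];
}

// Print the commands as they would appear in the startup script.
const commands = buildShutdownCommands("EXAMPLE_TOKEN");
console.log(commands.join("\n"));
```

Only `shutdown_script.sh` is wired to the job-completed hook here; `github_job_start_ttl_seconds` and the instance TTL act as separate fallbacks and are not modeled in this sketch.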
I just tested with a job which had an error intentionally introduced to make it fail. Exactly 1 minute after failure the instance was terminated.
Do you have an example of a workflow which could trigger a different type of failure?
Yes, this works. Thank you so much! I will use this in my workflow.