GSA-TTS/all_sorns

deployment script fails to check service status

Opened this issue · 1 comments

https://github.com/18F/all_sorns/blob/98ec018c306408f6161bc638c83ed9c8e997fc67/.cloud-gov/deploy.sh#L20

The service_exists function in deploy.sh does not actually verify that a service is available. The line cf service "$1" >/dev/null 2>&1 confirms that service creation has been initiated, but it does not check the actual status (and thus availability) of the service.
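To make the gap concrete, here is a minimal sketch contrasting the two checks. service_exists mirrors the line quoted above; service_ready is a hypothetical name, and it assumes the cf service output contains a "status:" line that reads "create succeeded" once provisioning is done.

```shell
# Existence check, as described above: exits 0 once the service instance
# exists, whatever state it is in.
service_exists() {
    cf service "$1" >/dev/null 2>&1
}

# Hypothetical readiness check: additionally requires the status line to
# report success. Assumes the success text is "create succeeded".
service_ready() {
    cf service "$1" 2>/dev/null | grep "status:" | grep -q "create succeeded"
}
```

For a service that is still provisioning, service_exists would succeed while service_ready would not.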

The status field in the cf service output can be checked to confirm that the service was actually created, rather than being "failed" or "in progress". One way to do this is to poll the status field in a loop with a time-based kill switch. The following should work as long as the success message for all services is "status: create succeeded" (which I'm not 100% sure of).

db_name="my-db-name"
service_status=$(cf service "$db_name" | grep "status:")
time_limit=600

# Poll every 100 seconds and give up after 600 seconds in total.
# Match on the substring so the spacing after "status:" does not matter.
while [ "$time_limit" -gt 0 ] && ! printf '%s\n' "$service_status" | grep -q "create succeeded"; do
    echo "Waiting for service to become available..."
    sleep 100
    time_limit=$((time_limit - 100))
    service_status=$(cf service "$db_name" | grep "status:")
done

As it stands, however, the script can hit a scenario where an RDS service has not finished provisioning but is already reported by the cf service command, because the command exits with status 0 as long as the service appears in the list. The rest of the script then wrongly assumes the service is ready.
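One way to close that gap is a small wait helper that the script calls before using the service. This is only a sketch: wait_for_service, its arguments, and its return codes are hypothetical, and the "create succeeded" and "create failed" strings are assumptions about the cf service output that should be checked against the CLI version in use.

```shell
# Hypothetical helper: wait_for_service NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 when the service is ready, 1 if provisioning failed, 2 on timeout.
wait_for_service() {
    name=$1
    remaining=${2:-600}
    interval=${3:-30}
    while :; do
        # Assumption: the output has a "status:" line such as
        # "create succeeded", "create in progress" or "create failed".
        status=$(cf service "$name" 2>/dev/null | grep "status:" || true)
        case $status in
            *"create succeeded"*)
                return 0 ;;
            *"create failed"*)
                echo "Service $name failed to provision" >&2
                return 1 ;;
        esac
        if [ "$remaining" -le 0 ]; then
            echo "Timed out waiting for $name" >&2
            return 2
        fi
        echo "Waiting for $name to become available..."
        sleep "$interval"
        remaining=$((remaining - interval))
    done
}
```

A call such as wait_for_service "my-db-name" 600 30 || exit 1 would stop the script instead of continuing with a service that is not ready. Returning early on a reported failure avoids waiting out the full time limit when provisioning has already failed.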

Noting for posterity: This logic still exists, but is in setup.sh which doesn't try to deploy the app.