cloudfoundry/bosh-agent

Scan & fix task succeeds but does not fix a failed deployment

Closed this issue · 2 comments

If a deployment fails with a new vm being created that has a running bosh agent but no existing monit jobs yet, a scan and fix task is triggered and succeeds although it does not fix the failing deployment.

The reason for this behavior is the Status method, which initializes the monit status with "running" and jumps over the loop of all services afterwards, because they are not registered yet. Since the monit status is still "running", the scan and fix task expects that there is nothing to do. As a result, the deployment is still not fixed.

Task 163 | 16:46:18 | Scanning 1 VMs: Checking VM states (00:00:04)
Task 163 | 16:46:22 | Scanning 1 VMs: 1 OK, 0 unresponsive, 0 missing, 0 unbound (00:00:00)

At the moment, a re-deployment (with changes in the manifest) which forces the vm to be recreated will overcome this hickup.

As a possible solution it might make sense to introduce a new status for this case or reuse the existing status "unknown".

This issue was marked as Stale because it has been open for 21 days without any activity. If no activity takes place in the coming 7 days it will automatically be close. To prevent this from happening remove the Stale label or comment below.

This issue was closed because it has been labeled Stale for 7 days without subsequent activity. Feel free to re-open this issue at any time by commenting below.