cloudfoundry/cloud_controller_ng

`TASK_STOPPED` usage events can be emitted for tasks that were never actually running

tcdowney opened this issue · 0 comments

Issue

TASK_STOPPED app usage events can be created for a task that was never actually running.

Context

TASK_STARTED usage events are created when a task transitions from PENDING to RUNNING and is successfully scheduled on Diego. TASK_STOPPED usage events should be created when a RUNNING task eventually reaches a final state (e.g. SUCCEEDED or FAILED). Currently, if a PENDING task transitions directly to FAILED, a TASK_STOPPED usage event is created with no corresponding TASK_STARTED event. This can confuse users or systems that track app usage events.

Steps to Reproduce

  1. cf push an app
  2. bosh ssh on to the diego-api VM
  3. Run sudo su - and then monit stop bbs
  4. cf run-task APP_NAME --command "echo hello" --name my-pending-task
  5. Wait for the cf run-task command to fail (may take several minutes)
  6. cf curl /v3/app_usage_events and see that there is a TASK_STOPPED app usage event for the my-pending-task task, but no TASK_STARTED event
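
To make step 6 easier to check, here is a minimal Ruby sketch that scans saved `cf curl /v3/app_usage_events` output for TASK_STOPPED events with no matching TASK_STARTED event. The helper name `orphaned_task_stops` is hypothetical, and the sketch assumes each resource exposes the event type at `state.current` and the task GUID at `task.guid`; adjust the keys if the actual response differs.

```ruby
require 'json'

# Hypothetical helper: returns the task GUIDs that have a TASK_STOPPED event
# but no TASK_STARTED event in the given JSON document.
# Assumption: the event type is at state.current and the task GUID at task.guid.
def orphaned_task_stops(usage_events_json)
  resources = JSON.parse(usage_events_json).fetch('resources', [])

  started = []
  stopped = []
  resources.each do |event|
    guid = event.dig('task', 'guid')
    next if guid.nil? # skip events that are not tied to a task

    case event.dig('state', 'current')
    when 'TASK_STARTED' then started << guid
    when 'TASK_STOPPED' then stopped << guid
    end
  end

  # GUIDs that were stopped but never started, without duplicates.
  (stopped - started).uniq
end
```

The sketch only looks at the JSON it is given, so a TASK_STARTED event that sits on a different page of results would be reported as missing.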

Expected result

There should be no app usage events for this task because it was never scheduled and run on the platform.

Current result

A TASK_STOPPED usage event is created for the task with no corresponding TASK_STARTED event. This confuses end users who are tracking these usage events.

Possible Fix

Only create TASK_STOPPED events if there is an existing TASK_STARTED event.
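
A rough sketch of that guard, using an in-memory stand-in rather than the actual Cloud Controller models. The class `UsageEventLog` and its methods are hypothetical names chosen for illustration; the real fix would need to query the persisted usage events instead.

```ruby
# Hypothetical in-memory stand-in for the usage event store.
class UsageEventLog
  attr_reader :events

  def initialize
    @events = []
  end

  def record_task_started(task_guid)
    @events << { state: 'TASK_STARTED', task_guid: task_guid }
  end

  # Records TASK_STOPPED only when a TASK_STARTED event exists for the task.
  # Returns true if an event was recorded, false if it was skipped.
  def record_task_stopped(task_guid)
    return false unless started?(task_guid)

    @events << { state: 'TASK_STOPPED', task_guid: task_guid }
    true
  end

  private

  def started?(task_guid)
    @events.any? { |e| e[:state] == 'TASK_STARTED' && e[:task_guid] == task_guid }
  end
end
```

With this guard, a task that goes from PENDING straight to FAILED produces no usage events at all, which matches the expected result above. One thing worth checking before adopting it: if old usage events are ever purged, a long-running task could lose its TASK_STARTED event and then never get a TASK_STOPPED event.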


Other issue -- @sethboyles has also reproduced an issue where duplicate TASK_STOPPED events can be created if a CANCELING task is canceled multiple times in quick succession. We should see if we can address that as well.
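
One possible direction for the duplicate case is to make recording the stop event idempotent per task. The sketch below is only an illustration under that assumption: `StopEventRecorder` is a hypothetical name, and the in-process `Mutex` stands in for whatever database-level guard (for example a row lock or a uniqueness constraint) a multi-process deployment would actually need.

```ruby
require 'set'

# Hypothetical recorder that emits at most one TASK_STOPPED event per task GUID.
class StopEventRecorder
  attr_reader :events

  def initialize
    @events = []
    @stopped_guids = Set.new
    @lock = Mutex.new
  end

  # Returns true if the event was recorded, false if this task already has one.
  def record_task_stopped(task_guid)
    @lock.synchronize do
      # Set#add? returns nil when the GUID is already present.
      return false unless @stopped_guids.add?(task_guid)

      @events << { state: 'TASK_STOPPED', task_guid: task_guid }
      true
    end
  end
end
```

Repeated cancel requests for the same task then result in a single TASK_STOPPED event, regardless of how quickly they arrive.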