wayfair-incubator/telefonistka

Race condition when adding a new component and deleting a stage

Ben10k opened this issue · 8 comments

Ben10k commented

Description

We had 2 PRs open at the same time:

  • PR1: remove a stage from environment:
    • delete the stage's targetPath from the root telefonistka.yaml
    • delete the the directory for the stage itself
  • PR2: add a new component the should be promoted to all stages defined in the telefonistka.yaml

It happened to be that PR2 was merged before the PR1.

After the merge of PR2, telefonistka kicked in and created a new promotion PR (PR3) of the new component to all stages, including the the one that is being removed in PR2. This is expected an correct behaviour, as at that point in time, the main branch still had the old stage present.

Then, PR2 was merged, which removed the stage from the main branch.

But the promotional PR3 still contained promotion to the stage that no longer exists on the main branch.

Expected Behaviour

After PR1 was merged, I expected that telefonistka would either:

  • edit the PR3 to remove the promotion of the component to a no longer existing stage
  • close the PR3 and create a new PR that would promote component only to existing stages

Actual Behaviour

Nothing happened after PR1 was merged.

Workaround

While debugging and trying to figure out what was actually happening, we have tried redelivering the webhooks from Github to telefonistka.

  1. We tried to redeliver webhook for PR1 pull_request_closed. Nothing happened
  2. We tried to redeliver webhook for PR2 pull_request_closed. Then telefonistka raised a new PR4 to deploy the new component from PR2 to the current existing stages in telefonistka.yaml on main branch.

Then we were able to close the PR3 and merge the PR4

Affected Version

v0.1.3

Steps to Reproduce

  1. Set up a repository which has telefonistka enabled.
  2. Have empty files:
  • _WORKSPACE/component1.yaml
  • region1/component1.yaml
  • region2/component1.yaml
  1. Have telefonistka.yaml with this content:
configuration
promotionPaths:
- sourcePath: "_WORKSPACE/"
  promotionPrs:
  - targetPaths:
    - "region1/"
    - "region2/"
  1. Create a PR (PR1) and do NOT merge it
  • delete file region2/component1.yaml
  • remove line - "region2/" from telefonistka.yaml
  1. Create a PR (PR2) and do NOT merge it
  • add a new empty file _WORKSPACE/component2.yaml
  1. Merge the PR2
  2. Verify that telefonistka created a PR3 to promote _WORKSPACE/component2.yaml to both region1/component2.yaml and region2/component2.yaml
  3. Merge the PR1
  4. Verify that the PR3 is still promoting _WORKSPACE/component2.yaml to the region2/component2.yaml while region2 is no longer present on the main branch.

Checklist

Oded-B commented

Hi @Ben10k, I've somehow missed this issue.
I'll take a look in the following few days.

Oded-B commented

Because Telefonistka is stateless I would have to compare the current config with a pervious version to find the removed path.
Then go over all currently open PRs and search for files that match the remove path, seems possible.

This is a big change (relative to size of the project) so the plan is to start with just commenting warnings(no deletion/mutation).
I'll Warn in the triggering PR (PR1) when its opened/synced and the affected PRs(PR3) when PR1 is merged.

Ben10k commented

Thank you!
Having a warning pop up as a comment on PR3 once PR1 is merged would be a nice addition.
Then we would know that something is not right and then we can change the promotional PR before approving/merging it.

Automatically marking issue as stale due to lack of activity

Automatically closing this issue as stale

Automatically marking issue as stale due to lack of activity

Automatically marking issue as stale due to lack of activity

Automatically closing this issue as stale