sigstore/root-signing

Timestamp should either bypass preprod, or sync faster

Closed this issue · 13 comments

Description

The expiration is short enough, and the risk low enough, that going straight to prod is reasonable. Alternatively, reducing the time the timestamp sits in preprod would be an option.

@haydentherapper @hectorj2f and I talked about this today and the proposed fix is to update sync-preprod-to-prod.yml such that it will sync if more than one day has passed since publish to preprod OR the only changes to preprod are updating the timestamp or snapshot.
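To make the proposed condition concrete, here is a minimal Python sketch of the decision. It is not the contents of sync-preprod-to-prod.yml. The function name, the constants, and the use of bare file names are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical constants: the thread only says "more than one day" and
# "timestamp or snapshot"; the file names here are illustrative.
MIN_PREPROD_AGE = timedelta(days=1)
LOW_RISK_FILES = {"timestamp.json", "snapshot.json"}


def should_sync_to_prod(published_at, changed_files, now=None):
    """Return True if preprod may be synced to prod.

    Sync when more than one day has passed since the publish to preprod,
    OR when the only changes are to the timestamp and/or snapshot.
    """
    now = now or datetime.now(timezone.utc)
    aged_out = now - published_at > MIN_PREPROD_AGE
    changed = set(changed_files)
    # An empty change set does not count as a timestamp/snapshot-only update.
    only_low_risk = bool(changed) and changed <= LOW_RISK_FILES
    return aged_out or only_low_risk
```

For example, a timestamp-only update published an hour ago would sync immediately, while a root.json change published an hour ago would wait until the one-day threshold has passed.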

@lkatalin I like the idea of improving the logic; however, I feel we still need to reduce the threshold to less than a day. Otherwise we'll face the same problem again.

Regarding the updates to the timestamp or snapshot, which changes exactly would you be looking for, e.g. in https://github.com/sigstore/root-signing/pull/907/files ?

I’ve been trying to figure out the most straightforward way to implement this. I’m thinking of doing it in the sync-main-preprod action and just bypassing preprod. There’s an Action I found that returns a list of modified files - if it’s “timestamp” or “timestamp, snapshot”, then we simply sync to prod too.

> I’m thinking of doing it in the sync-main-preprod action and just bypassing preprod.

Should the sync-main-preprod action be syncing anything to prod? That could potentially get confusing. I thought what you described yesterday was to bypass the "> 24 hours since preprod modification" check in the preprod-to-prod action. Or maybe there could be a separate workflow that only syncs timestamp-only or snapshot-and-timestamp-only updates to main, and bypasses preprod?

Thinking a bit more about this, particularly about the purpose of preprod: preprod was meant for risky, manual changes, to be able to verify common workflows (with a prober) against a GCS bucket. Automated events are low risk and don't need a staging bucket. Maybe at one point, when we were doing a lot of changes, there was some risk, but now timestamp/snapshot generation is very stable.

I'd like to propose the following:

  • Timestamp/snapshot updates always go straight to production
  • Preprod-to-prod syncs are no longer automated, only supporting a manual trigger
  • Root/target/delegate metadata updates will require manually running the preprod-to-prod sync. We'll update the ceremony steps to include running the probers manually and kicking off the sync manually.

>   • Timestamp/snapshot updates always go straight to production
>   • Preprod-to-prod syncs are no longer automated, only supporting a manual trigger
>   • Root/target/delegate metadata updates will require manually running the preprod-to-prod sync. We'll update the ceremony steps to include running the probers manually and kicking off the sync manually.

It looks like a good plan to me.

I can try to fix this with a new workflow to handle a timestamp/snapshot update. A couple of questions I have:

  • When the timestamp/snapshot updates, do we want to sync all files or just the timestamp/snapshot files?
  • Do we care that during a timestamp/snapshot update, we will still also be syncing main to preprod, or should we remove that?

Here are the changes I would recommend:

  1. Change https://github.com/sigstore/root-signing/blob/main/.github/workflows/sync-main-to-preprod.yml to sync main to prod bucket, and only if the changed files are exactly (timestamp.json) or (timestamp.json, snapshot.json, .snapshot.json). We need this check, otherwise the ceremony-to-main GHA will also push to prod.
    • We should still support running this manually, in case the GHA doesn't fire correctly
  2. Change https://github.com/sigstore/root-signing/blob/main/.github/workflows/sync-preprod-to-prod.yml to fire only on workflow dispatch. We should submit this concurrently, because once we stop writing to the preprod bucket, we want to stop syncing from it
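The "exactly these files" check in step 1 could look roughly like the Python sketch below. This is an illustration only, not the workflow's actual implementation. The function name and the `repository/` directory in the example are hypothetical, and the ".snapshot.json" entry from the list above is read here as "any file name ending in `.snapshot.json`", which is an assumption.

```python
from pathlib import PurePosixPath


def is_timestamp_snapshot_only(changed_paths):
    """Return True only for the two change sets that may go straight to prod.

    Allowed sets (compared by file name, ignoring directories):
      - {timestamp.json}
      - {timestamp.json, snapshot.json, <something>.snapshot.json}
    """
    names = {PurePosixPath(p).name for p in changed_paths}
    if names == {"timestamp.json"}:
        return True
    # Assumption: ".snapshot.json" in the thread means a file whose name ends
    # with that suffix; at least one such file must be present.
    suffixed = {n for n in names if n.endswith(".snapshot.json")}
    return bool(suffixed) and names == {"timestamp.json", "snapshot.json"} | suffixed
```

Any other file in the change set, for instance root.json from a ceremony, makes the check fail, which is what keeps the ceremony-to-main push from reaching prod automatically.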

For ceremony changes:

  1. Change https://github.com/sigstore/root-signing/blob/main/.github/workflows/sync-ceremony-to-main.yml to push to both the main branch (it's currently doing that) and the preprod bucket.
  2. Update https://github.com/sigstore/root-signing/blob/main/playbooks/ORCHESTRATION.md with these steps

@asraa and @kommendorkapten to double check me

Edit: Reordering steps

There might be a better way to do this with GHA triggers, open to suggestions!

So, something like this?

| Content change | Ceremony branch update | Main branch update | Preprod update | Prod update |
| --- | --- | --- | --- | --- |
| ceremony | manual | sync-ceremony-to-*.yml on push to ceremony branch | sync-ceremony-to-*.yml on push to ceremony branch | manual trigger of sync-preprod-to-prod.yml |
| timestamp OR snapshot ONLY | n/a | manual or bot | n/a | sync-main-to-prod.yml on push to main branch |
| any other content (which may include timestamp or snapshot as well) | n/a | manual | additional workflow? | additional workflow triggered manually? |

Do we want or need any workflows to take care of syncing "other" changes to the main branch to prod? Is there a circumstance under which that will happen?

The first two rows look correct to me. There should not be any other content that we have to deal with, partially because we don't have the workflows created for it. For example, if we choose to only update targets (let's say because of a compromise), we'll likely just do a full root signing.

> @haydentherapper: Here's the changes I would recommend:

That looks good, but I have one question. If we omit preprod from automatic updates, we risk rendering the preprod bucket invalid, as the timestamp and snapshot may expire. I would think we want to sync both prod and preprod when timestamp and/or snapshot are updated. Thoughts?

Yep, that works for me, since we already have probers testing preprod and we don't want those to break. @lkatalin added preprod syncing in c0b5d67
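As a summary of where the thread ended up, the agreed routing can be written down as a small lookup. This is only a sketch of the outcome as discussed here. The change-type and target labels are made up for illustration and are not identifiers from the repository.

```python
# Labels are hypothetical; they only mirror the terms used in this thread.
SYNC_PLAN = {
    # Ceremony changes reach the main branch and the preprod bucket
    # automatically; prod needs a manual run of the preprod-to-prod sync.
    "ceremony": {"main": "automatic", "preprod": "automatic", "prod": "manual"},
    # Timestamp/snapshot-only changes on main go to both buckets, so the
    # preprod metadata does not expire and the probers keep passing.
    "timestamp_snapshot": {"preprod": "automatic", "prod": "automatic"},
}


def targets(change, mode):
    """List the targets updated in the given mode ("automatic" or "manual")."""
    return sorted(t for t, m in SYNC_PLAN[change].items() if m == mode)
```

Here `targets("ceremony", "manual")` lists only prod, while a timestamp/snapshot update has no manual targets at all.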