sigstore/root-signing

Timestamp update did not trigger main to preprod sync

Closed this issue · 25 comments

#884 did not trigger a main to preprod sync. This is strange because:

  • #884 updates timestamp.json, the path for which is https://github.com/sigstore/root-signing/.../repository/repository/timestamp.json
  • The sync main to preprod file defines that a sync should occur when any file under repository/repository is changed
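To make the expectation above concrete, here is a minimal sketch of the path check, assuming the trigger simply covers everything below `repository/repository`. The constant `WATCHED_DIR` and the function name are hypothetical, and GitHub's own path-filter glob matching may differ in its details.

```python
from pathlib import PurePosixPath

# Assumption: the sync workflow watches everything below this directory.
WATCHED_DIR = PurePosixPath("repository/repository")


def is_under_watched_dir(changed_file: str,
                         watched: PurePosixPath = WATCHED_DIR) -> bool:
    """Return True if changed_file lives anywhere below the watched directory."""
    path = PurePosixPath(changed_file)
    # `parents` lists every ancestor directory of the file.
    return watched in path.parents


print(is_under_watched_dir("repository/repository/timestamp.json"))
print(is_under_watched_dir("README.md"))
```

Under this assumption the timestamp.json change qualifies, so the missing sync is unlikely to be a path-filter problem.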
asraa commented

The reason may be that a bot merged the PR: #884

and GitHub Actions usually prevents events created with the default GITHUB_TOKEN from triggering further workflow runs (to prevent automatic workflow recursion)

See this related issue: #507
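One way to test this theory is to look at the `merged_by` field that the GitHub REST API returns for a pull request. The sketch below is a heuristic only: it assumes that a merge done with the default token is attributed to `github-actions[bot]`, and the sample payloads are hypothetical, not data from #884.

```python
# Assumption: merges performed with the default GITHUB_TOKEN are attributed
# to this login, while merges performed with a PAT show the PAT owner's login.
DEFAULT_TOKEN_ACTOR = "github-actions[bot]"


def merged_with_default_token(pr: dict) -> bool:
    """Guess whether a PR (as returned by GET /repos/{owner}/{repo}/pulls/{n})
    was merged with the default token, based on its `merged_by` field."""
    merged_by = pr.get("merged_by") or {}  # None when the PR is not merged
    return merged_by.get("login") == DEFAULT_TOKEN_ACTOR


# Hypothetical payloads for illustration only.
bot_merge = {"merged": True, "merged_by": {"login": "github-actions[bot]", "type": "Bot"}}
pat_merge = {"merged": True, "merged_by": {"login": "some-pat-owner", "type": "User"}}

print(merged_with_default_token(bot_merge))
print(merged_with_default_token(pat_merge))
```

If the check comes back positive, the resulting push would not be expected to start the sync workflow.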

Oh, I didn't know that! Thank you @asraa , this may be the problem.

@asraa, I have configured the workflow to use the secret -

Do you have any guesses why it merged with the bot rather than the PAT?

Could it be that a PAT can't be reused with a workflow created after the PAT's date?

@cpanato any guesses?

asraa commented

That's the PAT that creates the PR, not merges it:

This argument is passed, unchanged, to the job that creates the pull request.

so I don't think that is a reason for why this is happening.

asraa commented

It's true that the review workflow is using the token passed in too, but is GITHUB_TOKEN override-able? Maybe it should just be TOKEN.

I'm guessing that's the problem. GITHUB_ vars are usually not override-able.

FWIW, #880 looks identical to #884, in that a bot merged the PR

asraa commented

To add logging for the PAT attribution, maybe each script can run gh auth login. I'm not sure though. Otherwise, the PRs look identical. It just matters which underlying PAT was used.

asraa commented

FWIW, #880 looks identical to #884, in that a bot merged the PR

The last time I saw a commit trigger the sync job was a month ago: https://github.com/sigstore/root-signing/actions/workflows/sync-preprod-to-prod.yml?page=9 (and that wasn't a review bot commit)

It was this workflow that didn't trigger - https://github.com/sigstore/root-signing/actions/workflows/sync-main-to-preprod.yml

Which seems to say that it is running after a commit is pushed by the bot
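To compare runs without paging through the Actions UI, the run list for a workflow can be fetched from the REST API (`GET /repos/{owner}/{repo}/actions/workflows/{workflow_id}/runs`) and summarized by triggering event. This is a rough sketch: the helper names are hypothetical, the sample payload is made up, and it assumes the response carries `event` and `display_title` for each run.

```python
from collections import Counter


def count_runs_by_event(payload: dict) -> Counter:
    """Count workflow runs per triggering event (push, schedule, ...)."""
    return Counter(run["event"] for run in payload.get("workflow_runs", []))


def titles_for_event(payload: dict, event: str) -> list:
    """Return the titles of runs started by the given event."""
    return [
        run.get("display_title", "")
        for run in payload.get("workflow_runs", [])
        if run["event"] == event
    ]


# Hypothetical payload for illustration only.
sample = {
    "workflow_runs": [
        {"event": "push", "display_title": "Update Snapshot and Timestamp"},
        {"event": "workflow_dispatch", "display_title": "Manual sync"},
        {"event": "push", "display_title": "Update Snapshot and Timestamp"},
    ]
}

print(count_runs_by_event(sample))
print(titles_for_event(sample, "push"))
```

A timestamp-only merge that never shows up among the push-triggered titles would point at the merge itself not firing the workflow.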

asraa commented

Hmmm maybe there was an underlying GitHub Actions change?

One other weird thing is I see no checkmark next to the PR after merge like I normally do:

[Screenshot, 2023-07-24 8:38:39 AM]

I wonder if it was simply a GHA temporary failure. Let's see if today's snapshot/timestamp go through smoothly, and I'll kick off another timestamp-only generation after that.

I think a simple fix is to run the sync main to preprod periodically too.

asraa commented

Yeah, I wonder if GHA was down at the time...

Chatted with @bobcallaway, plan is:

  • Wait to see if the snapshot/timestamp PR gets merged and the sync to preprod kicks off
  • I'll manually kick off the workflow for timestamp only, wait for the PR to get merged and see if preprod kicks off
  • If it doesn't, we'll regenerate the CI tokens and update the secrets in this repo and try again
  • If that doesn't work...ask someone from GH?

^ Whoops, that's funny, I guess the issue title matched the regex

No issue with the latest run - https://github.com/sigstore/root-signing/actions/runs/5649273862

My guess is it was a random GHA failure.

No issue with the latest run - https://github.com/sigstore/root-signing/actions/runs/5649273862

My guess is it was a random GHA failure.

For my edification, when I look at the main to preprod workflows, it looks like the ones triggered automatically all say "Update Snapshot and Timestamp" (referring to PRs on this repo merged by the bot), but the PRs we weren't sure are triggering the sync are Timestamp-only. I don't see those in the sync workflows. What am I missing or misreading?

If we use the default injected GitHub token, that will not trigger anything; we need to use a PAT for that,
but it looks like that was not the reason :)

We are experiencing this issue again! Re-opening!

I misread the executed workflows slightly and thought this issue was at fault again, but in today's incident, the sync did occur. The actual problem was that we need to sync more frequently (see #893). I think we can close this issue again.