Don't fail publishing when registry repos are behind the upstream
thomashoneyman opened this issue · 3 comments
In purescript/registry#336 (comment) a package failed to commit and push its metadata because the local checkout of the registry repo was behind the upstream by one commit. I don't think this should happen: we should be able to pull to re-sync and then try the push again.
The error in this case arose from the implementation of `WriteMetadata`:
registry-dev/app/src/App/Effect/Registry.purs, lines 250 to 271 at commit 8177cd3
...but it applies everywhere we use `writeCommitPush`, which includes writing metadata, manifests, package sets, and other registry data to the various registry repositories. The window for getting behind the upstream is narrow (another commit would have to land after the pull but before the push in the implementation linked below), but it's certainly possible:
registry-dev/app/src/App/Effect/Registry.purs, lines 678 to 693 at commit 8177cd3
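For orientation, here is a rough sketch of the pull/write/commit/push shape described above, with the race window marked. This is not the actual registry-dev code; `runGit` is a hypothetical placeholder for shelling out to git:

```purescript
module PullCommitPushSketch where

import Prelude

import Data.Either (Either(..))
import Effect (Effect)
import Effect.Aff (Aff, launchAff_)
import Effect.Class.Console as Console

-- Hypothetical stand-in for spawning a git process; the real code
-- runs git and inspects the exit code.
runGit :: Array String -> Aff (Either String String)
runGit args = do
  Console.log ("git " <> show args)
  pure (Right "")

-- The overall shape: pull, write files, commit, push.
writeCommitPush :: Aff Unit -> Aff (Either String Unit)
writeCommitPush writeFiles = do
  _ <- runGit [ "pull", "--autostash" ]
  writeFiles
  -- Race window: if another job pushes to the upstream after the
  -- pull above but before the push below, the commit step sees we
  -- are behind and the pipeline currently bails out entirely.
  _ <- runGit [ "commit", "--all", "--message", "Update registry data" ]
  result <- runGit [ "push", "origin", "main" ]
  pure (map (const unit) result)

main :: Effect Unit
main = launchAff_ do
  _ <- writeCommitPush (Console.log "writing metadata files")
  pure unit
```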
To drill down even further, the specific place where this error arises is the call to `Git.gitCommit`. If we are behind the upstream, we bail out of the commit.
registry-dev/app/src/App/CLI/Git.purs, lines 181 to 200 at commit 8177cd3
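For reference, detecting "behind the upstream" can be as simple as inspecting the branch header printed by `git status --branch --porcelain`, which looks like `## main...origin/main [behind 1]` when the local checkout is behind. A sketch of that check (the actual logic in Git.purs may differ):

```purescript
module BehindCheckSketch where

import Data.String (Pattern(..), contains)

-- `git status --branch --porcelain` prints a branch header such as
-- "## main...origin/main [behind 1]" when the local checkout is
-- behind its upstream. This sketch just looks for that marker.
isBehindUpstream :: String -> Boolean
isBehindUpstream statusOutput = contains (Pattern "[behind") statusOutput
```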
It's OK that we bail out, but we shouldn't bail out of the entire pipeline. Instead, we could bail out of the commit, pull again (with --autostash), and then try to commit again. Only after some number n of failed attempts to resync with the upstream would it be reasonable to fully terminate the process.
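As a concrete illustration, a minimal sketch of that retry loop, where `pullWithAutostash` and `commitAndPush` are hypothetical stand-ins for the real git operations rather than the registry-dev API:

```purescript
module RetrySyncSketch where

import Prelude

import Data.Either (Either(..))
import Effect (Effect)
import Effect.Aff (Aff, launchAff_)
import Effect.Class.Console as Console

-- Hypothetical stand-ins for the real git operations.
pullWithAutostash :: Aff (Either String Unit)
pullWithAutostash = pure (Right unit)

commitAndPush :: Aff (Either String Unit)
commitAndPush = pure (Right unit)

-- Retry the commit-and-push up to `attemptsLeft` times, resyncing
-- with the upstream between attempts instead of failing the whole
-- pipeline on the first "behind upstream" error.
commitWithRetry :: Int -> Aff (Either String Unit)
commitWithRetry attemptsLeft = do
  result <- commitAndPush
  case result of
    Right _ -> pure (Right unit)
    Left err
      | attemptsLeft <= 1 ->
          pure (Left ("Giving up after repeated failures to resync: " <> err))
      | otherwise -> do
          Console.log ("Behind the upstream, resyncing: " <> err)
          _ <- pullWithAutostash
          commitWithRetry (attemptsLeft - 1)

main :: Effect Unit
main = launchAff_ do
  result <- commitWithRetry 3
  case result of
    Right _ -> Console.log "Committed and pushed."
    Left err -> Console.log err
```

A short delay between attempts would further reduce the chance of two jobs repeatedly racing each other to push.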
Is this really a problem? I don't think we gain anything by allowing multiple jobs to run concurrently; we should just limit the number of running jobs to one.
It feels odd to arbitrarily limit jobs such that if someone else is uploading a package then I cannot publish mine. Isn’t it preferable to allow concurrent package uploads?
Concurrent jobs are certainly possible so long as the GitHub pipelines exist. If we want control over this, we will have to rewrite the GitHub module to send requests to the server instead of executing anything in the action itself. (I know we eventually want to disable the GitHub API altogether.)
If we don't want any concurrent operations (anything that touches a shared git resource, at least), then the build matrix, daily importer, etc. could not run while a package is being published, and vice versa. Isn't that problematic?
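One middle ground would be to serialize only the operations that touch a shared git checkout rather than limiting to a single job overall. A minimal sketch using an `AVar` as a mutex (assuming an Aff-based job runner; all names here are hypothetical, not the registry-dev API):

```purescript
module GitLockSketch where

import Prelude

import Effect (Effect)
import Effect.Aff (Aff, bracket, launchAff_)
import Effect.Aff.AVar (AVar)
import Effect.Aff.AVar as AVar
import Effect.Class.Console as Console

type GitLock = AVar Unit

newGitLock :: Aff GitLock
newGitLock = AVar.new unit

-- Take the lock, run the git-touching action, then release the lock.
-- `bracket` guarantees the lock is released even if the action fails,
-- and concurrent callers queue up on `AVar.take` instead of racing.
withGitLock :: forall a. GitLock -> Aff a -> Aff a
withGitLock lock action =
  bracket (AVar.take lock) (\_ -> AVar.put unit lock) (\_ -> action)

main :: Effect Unit
main = launchAff_ do
  lock <- newGitLock
  withGitLock lock (Console.log "publish job: write metadata")
  withGitLock lock (Console.log "daily importer: write metadata")
```

That way publishing, the build matrix, and the importer could still overlap in their non-git work while their writes to the registry repos are ordered.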
Here's another example of this happening:
purescript/registry#344 was terminated because purescript/registry#345 modified the metadata during its run.