actions/upload-artifact

Provide guidance for workflows broken by @v4?

nedbat opened this issue · 84 comments

nedbat commented

What files would you like to change?

I can see that @v4 brought improvements in speed and reliability, which is great. But it also broke a number of common patterns in workflows. Can you provide some guidance about ways to achieve the same outcomes with the new version? This might involve new features in actions/download-artifact.

For example:

What are your suggested changes?

I'm looking for help, I don't have solutions yet myself.

Personally, I think two changes would really help: If there was some way for download-artifact to extract multiple artifacts into a single directory, and if there was some way to use a glob to express the artifacts to download. They could even be combined, such as specifying name: coverage-* to extract all matching artifacts into one directory, the way a single name does. Or it could be a toggle, like merge: true/false/auto.
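
To illustrate, the interface being requested might look something like this (these inputs are hypothetical and don't exist in the action at the time of writing):

- uses: actions/download-artifact@v4
  with:
    name: coverage-*    # hypothetical: a glob selecting several artifacts
    merge: true         # hypothetical: extract all matches into one directory
    path: coverage-data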

FYI, this affects basically every cibuildwheel user (there are over a thousand), as uploading to a single shared artifact was the way you'd merge binaries built on different platforms before uploading to PyPI. It also affects the recommendations from PyPI and the Scientific-Python Development Guidelines, etc. It also affects publishing websites that are built in multiple steps (like building WebAssembly components).

FYI, people are merging the v4 updates without realizing the breakage, as Dependabot is making them separately¹. So if things appear broken, they go ahead anyway, assuming it's the expected v3/v4 incompatibility; and if the action is only used for the release step (common with cibuildwheel workflows), they won't even know it's broken until they try to release. I'm currently having to rush a release of cmake for Python to fix a segfault, and someone already merged the v4 updates, so that pipeline is broken. It would be great if this breaking change could be listed in the Dependabot update directly, rather than behind a link, but I assume that's too late.

Happy to update as much as I can to whatever the recommendation is. My current plan (in pypa/cibuildwheel#1699 & scientific-python/cookie#335) is:

    - uses: actions/upload-artifact@v4
      with:
        name: Wheel-${{ matrix.<whatever> }}
...

    - uses: actions/download-artifact@v4
      with:
        path: all

    - name: Merge files
      run: |
        mkdir dist
        mv all/Wheel-*/* dist

But besides the extra shell step, this will also download everything, even stuff like coverage files that are not needed for that job.

Footnotes

  1. I'm going to start recommending the recently released grouped Dependabot updates feature! That would have helped here.

Even in the context wider than just uploading to PyPI, when I build a bunch of wheels on different platforms, there should be a way to collect them all for testing (examples: https://github.com/aio-libs/yarl/blob/88900a1/.github/workflows/ci-cd.yml#L208-L265 / https://github.com/aio-libs/frozenlist/blob/b5ffcd7/.github/workflows/ci-cd.yml#L207-L264)

This also breaks the use of composite actions that themselves make use of either upload-artifact or download-artifact:

I'm curious if the change is an attempt to reduce occurrences of the race condition I reported a while back: actions/toolkit#1235 / actions/download-artifact#140 (comment).

As an additional datapoint here: GitHub's own docs for publishing to PyPI via OIDC encourage the "upload-then-download" pattern, which (I believe) will be broken by these action changes when used with multiple uploaders (which is common in Python, for builds on different hosts or platforms).

(This has the unfortunate potential to break a large percentage of Python publishing workflows.)

Ref: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-pypi
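
For reference, a minimal sketch of that pattern as commonly written against v3 (job names and paths are illustrative), relying on every matrix job appending to one shared artifact; under v4, the repeated uploads to name: dist fail because artifacts are now immutable:

build:
  strategy:
    matrix:
      os: [ubuntu-latest, macos-latest, windows-latest]
  runs-on: ${{ matrix.os }}
  steps:
    - run: pipx run cibuildwheel  # illustrative build step
    - uses: actions/upload-artifact@v3
      with:
        name: dist        # every matrix job uploads into the same artifact
        path: wheelhouse/

publish:
  needs: build
  runs-on: ubuntu-latest
  steps:
    - uses: actions/download-artifact@v3
      with:
        name: dist
        path: dist/
    - uses: pypa/gh-action-pypi-publish@release/v1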

If there was some way for download-artifact to extract multiple artifacts into a single directory

From my experiments while trying to find a way to adapt the hundreds of workflows I maintain that rely on the ability to upload multiple times from matrix jobs to the same artifact for transfer to a dependent job that consumes the content of the artifact, I find the actions/download-artifact@v4 action already has that capability.

For example, a workflow job with these steps will succeed and show that the files from the foo and bar artifacts are present under the baz folder as expected.

- uses: actions/download-artifact@v4
  with:
    name: foo
    path: baz

- uses: actions/download-artifact@v4
  with:
    name: bar
    path: baz

- run: ls baz

If you found some conditions where the action is not able to do this, please do submit a report to the actions/download-artifact issue tracker, as I agree this will be a common use pattern for those of us who are migrating from using a single artifact to multiple artifacts for transferring files from one job to another.

if there was some way to use a glob to express the artifacts to download

I agree. This request is already tracked at actions/download-artifact#6

The problem with repeating the action over and over is that for a normal coverage or wheel-build matrix, there might now be dozens of artifacts (one per Python version × OS × arch × PyPy/CPython, for example). Also, if one or more jobs fail, you still might want to upload coverage, which means you'd also need something like always() (see the sketch after the footnote below). Last I checked, cibuildwheel could produce 70+ different wheels if you targeted everything it can handle¹. You can also dynamically generate the matrix, see https://iscinumpy.dev/post/cibuildwheel-2-10-0/#only, which is how MyPyC does it. A normal coverage set is about 20 or so ((5 Python versions + 3 PyPy versions) × 3 OSs).

Footnotes

  1. I'm absolutely describing the "worst case": often (and for the default cibuildwheel examples) it's just 3 OSs, with all the wheels being built in one job per OS, plus an SDist job. So 4 artifacts, which isn't terrible listed out.
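
To make the always() point above concrete, a minimal sketch (job and artifact names are hypothetical) of a combining job that still runs when part of the matrix fails:

coverage:
  needs: tests
  if: always()  # combine coverage even when some matrix jobs failed
  runs-on: ubuntu-latest
  steps:
    - uses: actions/download-artifact@v4
      with:
        path: all
    - run: |
        mkdir coverage-data
        mv all/coverage-*/* coverage-data/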

(my attempt to summarize the important pain point contexts)

@per1234 your example features a linear workflow with known, static artifact names, but this issue refers to a different CI configuration practice, one that involves matrixes.

Note that all the other people who commented here maintain, combined, hundreds of workflows too. What unites us is that we all come from the Python ecosystem and rely on the ability to upload a number of artifacts from jobs in matrixes of varying sizes and different factor-naming strategies. We essentially need a way to combine many files coming from different jobs (not necessarily one file per job, but usually all uploaded in one go) in a "synchronization point": a job that waits for that matrix and does something with those files.

Two major contexts in which this is happening are:

  1. Publishing a bunch of Python distribution packages to PyPI: this has to happen together for all files, once we know all the previous jobs are complete.
  2. Combining coverage data produced in separate jobs into one, for determining if the overall metrics meet the configured threshold.

With different artifact names, we'd need to loop over the matrix artifacts somehow to get all those files together, and GitHub Actions workflows don't have syntax for such loops. Moreover, this means every project reinventing the wheel: naming artifacts differently and figuring out how to extract only the matching ones in that sync job.

I don't have time to experiment with upgrading this action this weekend but I plan to manually download like this https://github.com/WordPress/openverse/blob/00a26a65e93fe8ea2785d4a26c3ebce2efc935db/.github/actions/load-img/action.yml#L26-L35

However, rather than an enumeration I will perform pattern matching after I figure out the APIs.

I also have some hacking ideas with random names, pattern matching, and the CLI. Working on a composite-action-based experiment...

There are mentions here of separate v3/v4 upgrades breaking things, and I've had problems in many of my repositories. If anyone is using Renovate and has incompatibility problems, this rule can fix the problem easily (I added it to my user-level Renovate config to affect all repos at once):

"packageRules": [
	{
		"groupName": "actions/upload-artifact and actions/download-artifact",
		"description": "Artifact management is coupled in GitHub Actions. https://github.blog/changelog/2023-12-14-github-actions-artifacts-v4-is-now-generally-available/",
		"matchManagers": ["github-actions"],
		"matchPackageNames": [
			"actions/upload-artifact",
			"actions/download-artifact"
		],
		"matchUpdateTypes": ["major"]
	}
]

Note: the above is quite explicit; it could work without matching update types and managers too, if you want to generalize.

your example features a linear workflow with known, static artifact names

@webknjaz as indicated by the quote in that comment, my example is specifically in response to @henryiii's comment implying that actions/download-artifact is unable to download multiple artifacts to the same path:

If there was some way for download-artifact to extract multiple artifacts into a single directory

The sole purpose of my example was to supplement my comment that I have found that to be untrue. The actions/download-artifact action is able to download multiple artifacts to the same path.

You seem to somehow be interpreting my comment as being intended to dispute the impact of the change of only allowing one upload per artifact. It was not.

Note that all the other people who commented here maintain, combined, hundreds of workflows too. What unites us is that we all come from the Python ecosystem

That is false. I also maintain hundreds of workflows that were broken by this change and these workflows have nothing to do with Python. It is counterproductive to imply that this impact is specific to Python. The impact is much more widespread than that.

@per1234 I apologize! I meant to augment the context, not dismiss your input. I did indeed misinterpret your comment, though, and felt like it turned the discussion away from the most problematic part of the breaking change. I didn't mean to imply that this problem is only present in our ecosystem; I just wanted to give the context of who all those other people besides you are.

@ofek @henryiii @hynek @woodruffw so this is the UX I'm shooting for: https://github.com/webknjaz/re-actors--download-artifact--test/actions/runs/7242013168/workflow#L38-L42. Feel free to lurk and experiment, but don't rely on the Git identifiers; I'll make my usual versioning setup with immutable tags and rolling branches and will probably force-push the current thing a couple of times later today, before declaring it ready. It gets artifacts matching the wildcard into a temporary directory and then moves the underlying dirs into the user-requested path.

I'll also rename the action repo. Originally, I thought of it as a wrapper for the actions/download-artifact action, but I ended up not calling it and calling the GH CLI directly instead, so keeping the name as is would be misleading. Still, it'll keep using the original action's inputs list (with the same names and defaults). With that in mind, I was thinking of something like "fuse" / "meld" / "merge" / "combine" / "squash" and/or maybe "group" in the name, e.g. re-actors/meld-artifact-group.
I'm accepting naming ideas until I'm finished with the polishing. Also, I need a name for the marketplace listing, which may be different (but not necessarily).
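
In case it helps anyone experimenting along, the core of the approach described above boils down to something like this (a hedged sketch; the variable names are made up and are not the action's actual inputs):

# assumes GH_TOKEN is available in the environment for the gh CLI
tmp_dir="$(mktemp -d)"
gh run download "${GITHUB_RUN_ID}" --pattern "${NAME_WILDCARD}" --dir "${tmp_dir}"
# merge the per-artifact subdirectories into the requested path
mkdir -p "${TARGET_PATH}"
mv "${tmp_dir}"/*/* "${TARGET_PATH}"/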

Not sure how useful or directly applicable this is to OP's problem, but I worked around this in a project I help maintain using the gh cli tool and some reasonably simple bash logic here: snapcrafters/ci#24.

Hopefully useful to some folks 😄

👋 Hey everyone!

We appreciate all the feedback you’ve shared so far about upload/download artifact@v4. We’re excited about this release as it includes many improvements, like considerably faster upload/download speeds and the immediate availability of uploaded artifacts. In order to achieve these performance gains, we decided to fundamentally rewrite Artifacts from the ground up, which we’ll share more about in a future blog post.

There was some concern expressed about the immutability of an artifact once uploaded. Previously, each file within an Artifact was individually uploaded to a proxy service before landing in blob storage. This approach allowed actions/upload-artifact to be called on the same object multiple times but it also adversely impacted performance and risked content corruption during concurrent uploads to the same artifact. Artifacts in v4 are individually sealed archives (zips) that are assembled on the runner and then uploaded directly to blob storage. We believed this change was worth the tradeoff for significant performance gains and to protect the integrity of uploaded artifacts.

We'll be making an update to actions/download-artifact by end of day which will include a solution similar to others proposed in this issue:

  • The download-artifact action will have a new pattern: input that will filter the downloaded Artifacts. If you have multiple Artifacts being uploaded in a matrix scenario, you can suffix with other identifiers, e.g. my-artifact-${{ matrix.runs-on }}, and download them all at once with pattern: my-artifact-* (the matching upload side is sketched after the example below)
  • In addition, we’ll have another merge-multiple boolean input that will extract all the Artifacts’ content to the same destination directory.
  - name: Download Matching Artifacts
    uses: actions/download-artifact@v4
    with:
      pattern: my-artifact-*
      path: my-artifact
      merge-multiple: true
  - run: ls -R my-artifact
    # file-macos-latest.txt
    # file-ubuntu-latest.txt
    # file-windows-latest.txt
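
For completeness, the matching upload side could look like this (matrix values are illustrative):

upload:
  strategy:
    matrix:
      runs-on: [ubuntu-latest, macos-latest, windows-latest]
  runs-on: ${{ matrix.runs-on }}
  steps:
    - run: echo hello > file-${{ matrix.runs-on }}.txt
    - uses: actions/upload-artifact@v4
      with:
        name: my-artifact-${{ matrix.runs-on }}
        path: file-${{ matrix.runs-on }}.txt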

Thank you for your engagement with Artifacts and your valuable feedback.

Not sure how useful or directly applicable this is to OP's problem, but I worked around this in a project I help maintain using the gh cli tool and some reasonably simple bash logic here: snapcrafters/ci#24.

Hopefully useful to some folks 😄

That's basically what I've been attempting in my action experiment: GH CLI + passing name to --pattern + auto-merging. I've been trying to keep the UX as close to v3 as possible, because adding new arguments means more changes for the end users, which is rather annoying at scale.

@andrewakim Thank you for the update. Do you have any suggestions/guidance for workarounds for the 10 file limit per workflow? Or should we be using a different action for the following use cases:

  • A matrix workflow fans-out to 60+ jobs. Each uploads 2 uniquely named artifacts.
  • A matrix workflow fans-out to 60+ jobs. Each uploads 1 uniquely named artifact. A final aggregation step merges the test results into a single report.

@schmidtw the readme says:

Each job in a workflow run now has a limit of 10 artifacts.

Does a job with matrix: strategy count as one "job" for you? It works for me: https://github.com/TWiStErRob/net.twisterrob.gradle/actions/runs/7231579385

👋 The changes @andrewakim stated are now released on the v4 branch.

You can find the new inputs documented in the README:

As well as a new migration document outlining this use case:

Thanks again for your feedback!

@robherley I haven't tried it yet, but this looks great. A link to the migration doc from the readme of this repo would help a lot. The first sign many people will see that indicates a needed migration will be an error from upload-artifact. The breaking-change notice in the readme is helpful, and a link to the migration doc will be even better. Thanks!

Thank you for implementing and releasing this, @robherley; it's been a long-awaited change! It's amazing to see the activity on these actions 🤩.

I've tested the PR branch before the reviews and it worked well. Please consider reviewing the issues linked in actions/download-artifact#259 (comment) and updating/closing those threads too if relevant.

@robherley the example in your PR demonstrates name: my-artifact-${{ matrix.runs-on }} as an example for naming different artifacts differently. In some cases, that's not enough to guarantee uniqueness. For example, with several matrixes adding to the same pattern or the use of extra include: entries. A solution could be to have that suffix generated randomly (with something like python3 -c 'import uuid; print(uuid.uuid4())' or python3 -c 'import secrets; print(secrets.token_hex(16))').
I think the upload action migration doc may need better examples with this, and maybe even some kind of a first-class support through a new input.
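
A hedged sketch of the random-suffix idea (the step id, artifact prefix, and path are hypothetical):

- name: Generate a unique artifact suffix
  id: suffix
  run: echo "value=$(python3 -c 'import secrets; print(secrets.token_hex(16))')" >> "$GITHUB_OUTPUT"
- uses: actions/upload-artifact@v4
  with:
    name: my-artifact-${{ steps.suffix.outputs.value }}
    path: dist/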

@robherley do you know if common directory layout subpaths are merged seamlessly? I don't see examples or tests for that in the PR.

I think we should really get rid of the 10-artifact limit. AFAIK, there's no real reason to have it (the fact that most jobs upload fewer isn't a reason).

@schmidtw the readme says:

Each job in a workflow run now has a limit of 10 artifacts.

Does a job with matrix: strategy count as one "job" for you? It works for me: https://github.com/TWiStErRob/net.twisterrob.gradle/actions/runs/7231579385

Thank you for pointing that out. I read it a couple of times, & for whatever reason my brain replaced "job" with "workflow". 10 artifacts per job is perfectly fine.

FTR here's the guidance I'm giving in one of the shared maintenance orgs I oversee: https://github.com/orgs/aio-libs/discussions/31#discussioncomment-7891886

I've just been bitten too. It's a bad experience for users that the breaking changes are not mentioned at all on the release page https://github.com/actions/upload-artifact/releases. The corollary is that the breaking changes are not mentioned in @dependabot's PRs either.

Regarding the fix, using ${{ matrix.whatever }} in the artifact name is not a very general solution. Some workflows have matrices with several dimensions, and some matrix values are not suited to an artifact name (they can contain forbidden characters).
I think the best solution is to use ${{ github.job }}-${{ strategy.job-index }}
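
A minimal sketch of that scheme (artifact prefix and paths are hypothetical; note the download pattern hard-codes the uploading job's id, since github.job would resolve to the downloading job's own id there):

# in a matrix job whose id is "build"
- uses: actions/upload-artifact@v4
  with:
    name: dist-${{ github.job }}-${{ strategy.job-index }}
    path: dist/

# in a dependent job
- uses: actions/download-artifact@v4
  with:
    pattern: dist-build-*
    path: dist
    merge-multiple: true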

Regarding the fix, using ${{ matrix.whatever }} in the artifact name is not a very general solution. Some workflows have matrices with several dimensions, and some matrix values are not suited to an artifact name (they can contain forbidden characters).
I think the best solution is to use ${{ github.job }}-${{ strategy.job-index }}

Is the job id unique across re-runs? Should the attempt id be included?

Wow, I didn't realize that the release page didn't mention the breaking change...

@andrewakim Thanks for helping get the updated download functionality in place, but as a bit of feedback: your comment starts with two long paragraphs about how great v4 is, waiting until the third paragraph to address our concerns. This underscores the general feeling of being marketed a product that doesn't do what we need, rather than having peer-to-peer engineering discussions about how to solve our problems.

That's not at all how I interpreted it, as the first two paragraphs include important context about the implementation that explains why the change was made. Until then, I did not really know how it worked behind the scenes.

I agree it's useful context that should have gone into the release notes when it went out, instead of everyone being notified by dependabot and seeing things were broken without a clear explanation.

I agree it's useful context that should have gone into the release notes when it went out, instead of everyone being notified by dependabot and seeing things were broken without a clear explanation.

There's a more dangerous side of this: some projects wouldn't even be notified in PRs when the download action isn't run unconditionally, but is gated by a release request. It'd be quite unpleasant to get the news months after that upgrade got merged, in the middle of a release process.

Knowing this, I searched through the PyPA repos the other day and found an actual thing that would break in this exact manner, because the workflow that got the update is triggered manually: pypa/pipx#1147 (comment).
And I'm also tracking the recommendations within the @aio-libs org @ https://github.com/orgs/aio-libs/discussions/31 to provide guidance to the maintainers of different repos under the umbrella.

@nedbat Now that the work has been done, is it fine to close this issue? #472 (comment)

@ofek I think we still need to have a clear example/understanding of how to make unique artifact names with multiple matrixes and multiple reruns. So maybe, we'd wait for more answers?

${{ github.job }}, proposed by @ericLemanissier, contains the job id, which is essentially a YAML key. It's unique within one workflow, but is the same across reruns. ${{ strategy.job-index }} is a job's number within a matrix, where it's unique, but job indices will have the same numbers on rerun. So if a previous (even failing) job succeeded in uploading an artifact, its rerun will just error out, with no fix short of re-triggering the entire thing with a separate commit or something.

The best approach is probably to upload artifacts with name: prefix-${{ github.run_attempt }}-${{ github.job }}-${{ strategy.job-index }} and download with pattern: prefix-${{ github.run_attempt }}-*

and download with pattern: prefix-${{ github.run_attempt }}-*

Wouldn't different jobs have different run_attempts if I only rerun a few jobs, but not all?

and download with pattern: prefix-${{ github.run_attempt }}-*

Wouldn't different jobs have different run_attempts if I only rerun a few jobs, but not all?

Perhaps we need timestamps at the beginning of the pattern to ensure that the downloading and merging happen in a predictable manner. Though it's not clearly documented (or tested) whether files in newer artifacts would override the older ones.

Actually, ignore my last comment. Re-running a job seems to delete the artifacts uploaded by said job, because it is able to re-upload an artifact with the same name. In the general case (if your job name has no forbidden characters), the solution is to upload artifacts with name: prefix-${{ github.job }}-${{ strategy.job-index }} and download with pattern: prefix-*

Actually, ignore my last comment. Re-running a job seems to delete the artifacts uploaded by said job, because it is able to re-upload an artifact with the same name. In the general case (if your job name has no forbidden characters), the solution is to upload artifacts with name: prefix-${{ github.job }}-${{ strategy.job-index }} and download with pattern: prefix-*

This feels like a regression rather than an intentional breaking change, and it doesn't feel like it should be on the user to have to work around it by adding uniqueness.

In our case, the latest run doesn't have any uploaded artifacts:
https://github.com/Azure/bicep/actions/runs/7252241798/attempts/4

Whereas an older run does:
https://github.com/Azure/bicep/actions/runs/7252241798/attempts/3

The GitHub UI is also busted - for the older run (see my address bar):

Screenshot 2023-12-19 at 4:23:26 PM

Yet there are artifacts:

Screenshot 2023-12-19 at 4:23:45 PM

For the newer run, a number is displayed:

Screenshot 2023-12-19 at 4:23:54 PM

Yet no artifacts are displayed.

Since matrix jobs don't support dynamic outputs (lol), I have been using an artifact as a shared store for matrix job outputs: a single artifact with different files inside it. Now that workaround has been broken by this change.
Time to find a workaround for my workaround, I guess...

Matrix jobs certainly do support outputs, I do it here: https://github.com/pypa/hatch/blob/hatch-v1.9.0/.github/workflows/build-hatch.yml#L84-L85

But is that matrix output dynamic?

Edit: I'll add dynamic for clarity.

Actually, ignore my last comment. Re-running a job seems to delete the artifacts uploaded by said job, because it is able to re-upload an artifact with the same name. In the general case (if your job name has no forbidden characters), the solution is to upload artifacts with name: prefix-${{ github.job }}-${{ strategy.job-index }} and download with pattern: prefix-*

Thank you for checking 🙏

Matrix jobs certainly do support outputs, I do it here: https://github.com/pypa/hatch/blob/hatch-v1.9.0/.github/workflows/build-hatch.yml#L84-L85

@ofek all individual matrix jobs likely produce the same version in your case, but if each job were to output different values, the value would be overwritten by the last job to finish in the matrix. That's the main problem here.

But is that matrix output dynamic?

Edit: I'll add dynamic for clarity.

@shinebayar-g I was googling about the current issue the other day and stumbled upon an issue or PR that implements templating for the output keys. I can't find it again now, but the docs confirm that it's possible:

You can use the matrix context to create unique output names for each job configuration.

(https://docs.github.com/en/actions/using-jobs/defining-outputs-for-jobs#overview)

This means that you can have

...
  matrix-job:
    matrix:
      thing:
      - ...
      - ...
      - ...
    outputs:
      matrix-output-${{ matrix.thing }}: ${{ steps.matrix-step.outputs.variable }}
    ...
...

This likely means that you'd have to inspect the needs.matrix-job.outputs context dynamically later on, or reuse your matrix definition somehow, to access said outputs.
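
A hedged sketch of what that could look like: the consumer repeats the same matrix and uses the format() expression function to address the templated output keys (all names here are hypothetical):

consumer:
  needs: matrix-job
  strategy:
    matrix:
      thing: [a, b, c]  # must mirror the producer's matrix
  runs-on: ubuntu-latest
  steps:
    - run: echo "${{ needs.matrix-job.outputs[format('matrix-output-{0}', matrix.thing)] }}"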

I'm a bit lost on how dynamic matrices work and how they affect this because I don't use them.

My matrices are working well using the guidance in the migration documentation: nedbat/coveragepy@9bda95d

I think "dynamic" means different output from each job above. There's also truly dynamic matrices that are generated from a step, like in https://iscinumpy.dev/post/cibuildwheel-2-10-0/#only.

I'm updating the scientific-python development guide to use the suggestions here, and sp-repo-review will soon be able to detect and flag repos with workflows that need updating for v4. Those should go live in a day or two if all goes well. We have just merged updated documentation for cibuildwheel.

Thanks for providing a good path forward!

Do we have even a single person from GitHub working on upload-artifact engaged with this? I see over 50 people and more than 10 issues raised about how breaking v4 is, but zero feedback from anyone with a "member" badge. Was v4 released on a Friday evening before going on a long vacation? ;)

Meanwhile dependabot is proposing version updates that would continue to break our projects, sometimes in a very hard to spot manner.

Do we have even a single person from GitHub working on upload-artifact engaged with this?

@andrewakim and @robherley are marked as "Staff", and have made changes. There are many comments here, so it's easy to get lost in them, but it's not like nothing is happening.

Thanks for pointing that out; yep, I tried to spot them and failed among the rest of the comments. If something is baking, that's good. We will tame Dependabot until we have a fix. I also had to do a few rollbacks because I did not realise that v4 was not really working... with coverage.

There are many comments here, so it's easy to get lost in them, but it's not like nothing is happening.

A link to the two comments in this thread if anyone is having a hard time finding them: #472 (comment), #472 (comment)

It's rather simple:

  • Changes are not backward compatible
  • Changes offer no visible benefit over v3 implementation, AFAICS (that may be my blind spot though)
  • Breaks many/all (matrix) builds (using Arduino/C++ here, not Python)
  • No migration documentation

Conclusion: Must be a wrong/poor implementation, so no reason whatsoever to implement, and this change should be reverted.

NB: Exactly the same feelings for the actions/download-artifact v4 release

@tonhuisman did you miss that they added the migration documentation as requested? Did you overlook the explanation of the benefits?

Changes offer no visible benefit over v3 implementation, AFAICS (that may be my blind spot though)

I have seen some reports that v4 is significantly faster in some cases.
Also, the uploaded artifacts are available immediately rather than after all jobs have finished, which is quite a significant improvement IMO.
They've also mentioned that these changes will enable them to finally implement long requested UI improvements; we'll have to wait and see on this one.

Breaking the merging capability is unfortunate, but at least they've provided an OK workaround for merging while downloading with download-artifact.
I would like to also see an option for downloading multiple artifacts at once from the summary page, hopefully this will be added at some point.

@tonhuisman did you miss that they added the migration documentation as requested? Did you overlook the explanation of the benefits?

@nedbat Thanks, I have found the migration documentation by now (though it's somewhat hidden) and implemented it.

About the benefits:

  • We're using a matrix build that produces ~160 artifacts, usually with 2 files per artifact (we are working on reducing the matrix size, but for other reasons; these are PlatformIO ESP builds for a couple (8) of ESP types and several HW configurations). Faster uploads don't really contribute much to reducing the build time here, as that's about 1% of the build time...
  • When downloading the matrix artifacts (combining related artifacts into 8 archives for easier distribution) using actions/download-artifact@v4, I often see connection issues breaking the build... 😞 I've switched to trying to assemble a single archive, while still having the separate artifacts available, but that doesn't solve the download issues, of course.

About my expectations:

  • Improving upload speed should be achievable while staying backward compatible. Backward compatibility is king! (Yes, I've been in software development 'for some time', like 35+ years 😃; it's important.) Though I've seen many things break over the last couple of years since we started using GH Actions (but also in PlatformIO).

Improving upload speed should be achievable and still be backward compatible. Backward compatibility is king!

GitHub Actions use semantic versioning and require an explicit tag or Git reference, so introducing a breaking change in a major release is totally standard, since you are the one controlling the version. Actions should be treated just like dependencies; there is no difference.

since you are the one controlling the version

Yes, but it would keep me stuck on that older version until it is no longer usable (Node16 vs Node20, once Node16 goes out of support...) or until I'm ready to upgrade my scripts to be compatible again, which requires quite some work (mostly testing, but a lot of code-tinkering first).

What you speak of is no different than a major version upgrade of a dependency in any ecosystem. Are you then advocating that there should be no breaking changes even between major versions?

I've said it earlier, but I'll repeat it because nothing happened:
As for all major version upgrade with breaking changes, the breaking changes have to be mentioned in the release itself, along with a link to the migration guide. I cannot see either in https://github.com/actions/upload-artifact/releases/tag/v4.0.0

@ericLemanissier 💯 on your point. While there was technically a major version bump** here, what was obviously missed was "guidance", both in the form of "if you used to do x, now do y" and in clearly communicating the breaking changes in the release notes themselves. The fact that Dependabot created PRs doesn't help at all here. We could also have done a better job with a -preview release, garnering feedback from customers looking for the speed/reliability improvements. Some scenarios were obviously missed. The team is discussing all of these, working hard to close the gaps, and will do better going forward.

Thank you so much for the feedback.

** technical note on compatibility discussed in the docs; however, no excuse for the problems above ☝

Yes, but it would keep me stuck on that older version until they are no longer usable (Node16 vs Node20? once Node16 goes out of support...)

@tonhuisman We will maintain v3 along with backward compatibility. IF it's ever sunsetted, it will go through the long deprecation/announce cycle. v3 should get proper support, including Node end-of-life updates etc., as long as it's supported. That said, we would hope that 10x++ perf gains, reliability, and the artifact being immediately available would be a reason to move over and update your YAML (with migration guidance). If there's a scenario where something isn't possible with v4 that was possible with v3, we would like to know.

On the note of compatibility: it's more than just actions YAML input compatibility. The model and behavior are fundamentally different. Instead of an artifact "namespace" of files that doesn't get sealed until the end of the run, it's one that's sealed per job and can therefore be zipped, streamed up, and made available immediately. It also has different expectations about the behaviors and format on the download side. The net result is 10x++ perf gains for the 95%+ case, while hopefully staying possible and even still faster in the minority case. The model is so different that we even discussed whether it made sense to create another action, but that has the downside of fracturing the marketplace. In the end, since we guarantee major-version compatibility, it was decided that we would hold compat 100% with v3 and introduce a new model with v4. As mentioned above, the guidance and rollout were the problem. Hope all that makes sense.

@bryanmacfarlane As an action item you should probably do what others have mentioned and add the migration guide to the text of the v4 release.

@ofek 💯 discussed that with the team.

@bryanmacfarlane Thank you for your extensive explanation.

we would hope that 10x++ perf gains, reliability, and the artifact being immediately available would be a reason to move over and update your yaml (with migration guidance).

Well, possibly the fact that we have ca. 160 artifacts from a matrix build (not really sure if that's many, but most projects have a lot less), and that we try to actions/download-artifact@v4 them in a single step, may be the issue leading to connection failures; or maybe it's just a GH service that needs to be scaled up a bit (it does sometimes succeed). It's having to re-engineer something that was working OK that makes all this itch a bit.
We're trying to keep stuff up to date, but maybe an alternative could be a Node20-compatible release of the v3 actions?

@tonhuisman in the meantime, v3 is still 100% compatible and works as it did before v4 was added. The service side is also still 100% intact and unchanged for that flow.

The team should understand your scenario better; maybe create another issue with the scenario and what you're experiencing? It would probably be good to keep this issue focused on the guidance. v4 should theoretically use fewer connections (and be more reliable), since it works per artifact, whereas v3 went through many requests and multiple hops per file in an artifact (something like Node/JS artifacts with tons of small files being the worst case). But perhaps there's a problematic scenario?

@robherley

@bryanmacfarlane FYI what @tonhuisman describes sounds like this old issue I reported a while back that is now more likely to become visible than before: actions/toolkit#1235

@bryanmacfarlane FYI what @tonhuisman describes sounds like this old issue I reported a while back that is now more likely to become visible than before: actions/toolkit#1235

I've been able to work around that 'download failing' issue by using this wretry.action action for now.

It would be nice if we could upload artifacts to folders, or just have some way to group them visually by headings or something like that.

I don't mind uploading all the artifacts as separate zips (like most people in this thread, we have 100+ wide matrix builds), but please add a way to group related output together.

A useful UX for this would be a "Download All" button for each group, to avoid needing another GH action just to download them all and re-upload them in the same zip file.

IMO uploading and downloading anything in bulk from GitHub is already noticeably flaky because of the API rate limiter, and it looks like the suggestion in this thread is to make that a regular part of every pipeline due to the v4 breaking changes? Sounds like a good way to make every single pipeline flaky, especially at companies that have 50+ self-hosted runner boxes hitting the API from the same IP, like ours.

@bryanmacfarlane I'm going to reply to #472 (comment)

@tonhuisman We will maintain v3 along with backward compatibility. IF it's ever sunsetted it will go through the long deprecation / announce cycle. v3 should get the proper support including node end of lifes etc. as long as it's supported.

That did not happen under any practical interpretation. It's being killed within a month of your comment. That is not a lifecycle. The software you're asking us to use still has unreleased bug fixes, which means it doesn't work properly. The README at the root of these repositories starts with Improvements. It should not; it should start with Breaking Changes. It also shouldn't favor self hosted runners [sic]; it should favor things that are more commonly used, and that's clearly artifact name collisions (and I'm sure your team knows that).

There is a single sentence:

For assistance with breaking changes, see MIGRATION.md.

But, it's really buried and it isn't called out. You could even use the new GitHub markdown syntax to call it out.

Beyond that, this thing isn't supported by GHES, which means that if an action author (that's me) actually needed to support GHES customers as well as people using FPT, then we'd be screwed. That's really mean. (I don't know whether my action is used by GHES customers, but it's definitely used by enterprises.)

Your teams messed up repeatedly and made it worse here. FWIW, when my team messes up, we create a new major version that backs out the change and go back to the drawing board, trying to be much more careful about our next rollout. Sure, it means we end up with much higher major versions in some of our modules, but version numbers are relatively free. Browser vendors are well over 100, and it turns out that users don't mind.

The release notes for v4.3.0 do not remind people that v4 is incompatible with v3. I think at this point the first line of every v4.x release should be "v4.* is incompatible with v3, please see MIGRATION.md.", as noted in #472 (comment)

actions/* is using @v(n+1) releases for two purposes:

  1. when the actions runtime changes
  2. when an action's api surface changes

GitHub Actions teams apparently committed not to change the actions runtime within a major version, so much so that github/codeql-action messed up by merging a Node version update into v2 and then undid it.

If you're committed to not changing the Node runtime within a major version, then you should not have done an API break in the action.yml surface using n+1; you should have used floor(n/10)+11 instead.

That would have resulted in this release being actions/upload-artifact@v11 and actions/download-artifact@v11 and given you space to release an actions/upload-artifact@v4 and actions/download-artifact@v4 for the node runtime change.

The model is so different that we even discussed whether it makes sense to create another action but that has another downside of the market place getting fractured.

Yes, that would have been the correct choice. You could easily have created a proper deprecation system where the old action slowed down/warned over time while still receiving runtime updates.

It would have also allowed people to write a more useful migration tool to actually try to migrate users instead of letting dependabot make a mess of things by making PRs that did not work.

@bryanmacfarlane FYI what @tonhuisman describes sounds like this old issue I reported a while back that is now more likely to become visible than before: actions/toolkit#1235

I've been able to work around that 'download failing' issue by using this wretry.action action for now.

Unfortunately, that action also has a Node 16 dependency, so it now gets a deprecation notification too 😞
I'm removing it, hoping (praying?) that the previous bulk-artifact download issues have actually been resolved... 🙏

Edit: Ah, using the @master 'version' should fix that. No proper versioning/releasing there.

FWIW, for my action, I'm moving to gh run download instead of actions/download-artifact@v4.

A feature of actions/upload-artifact@v4 is that uploads are available as soon as they're made, which means gh run download can retrieve them within a workflow run, whereas that wasn't possible with actions/upload-artifact@v3 artifacts. (From my perspective, actions/download-artifact@v3 appeared to be using a private API to look under the hood and access artifacts before they were generally available.)
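
For anyone wanting to try the same, a minimal sketch as a workflow step (the pattern and directory are illustrative):

- name: Download matching artifacts with the GitHub CLI
  env:
    GH_TOKEN: ${{ github.token }}
  run: >-
    gh run download ${{ github.run_id }}
    --repo ${{ github.repository }}
    --pattern 'my-artifact-*'
    --dir out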

Using the plain actions/download-artifact@v4 now, I'm seeing this runtime warning:

(node:1805) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use `node --trace-deprecation ...` to show where the warning was created)

I guess there is still some work to do for the download-artifact action 🤔

On the positive side, it's all downloading fine 👍

Yes, that's tracked as:

I was following https://github.com/actions/download-artifact/blob/main/docs/MIGRATION.md, which talks about setting the artifact name to something like my-artifact-${{ matrix.runs-on }}. This won't actually work, right? It results in artifact names like "my-artifact-" for me, which seems to imply matrix.runs-on is not a thing you can use in this way.

@burnettk It's an example, use whatever your matrix inputs are.

ok, fair enough, thanks.

That works because the matrix in the example has a runs-on key. You need to set it to your own key(s), or you can use the less descriptive but universal strategy.job-index, which works for any matrix but just uses numbers.

(https://learn.scientific-python.org/development/guides/repo-review/ will even suggest this if you put in a repo that would fail with v4)

#472 (comment) feedback is valid.

Conceptually, here's a change to (hopefully) improve that experience:

 In v4, Artifacts are immutable (unless deleted).
-So you must change each of the uploaded Artifacts to have a different name
+So you give each uploaded Artifact a distinct name
+(in this example the matrix has a `.runs-on` property which is used to provide a distinct name,
+but how you choose a distinct name will vary)
 and filter the downloads by name to achieve the same effect:

In v4, Artifacts are immutable. So you must change each of the uploaded Artifacts to have a different name:

  - name: Upload unit tests results
    if: always()
-   uses: actions/upload-artifact@v3
+   uses: actions/upload-artifact@v4
    with:
-     name: test-results
+     name: test-results-unit
      path: |
        */build/test-results/unit
  - name: Upload instrumentation tests results
    if: always()
-   uses: actions/upload-artifact@v3
+   uses: actions/upload-artifact@v4
    with:
-     name: test-results
+     name: test-results-instrumentation
      path: |
        */build/test-results/instrumentation

And then you can merge it:

merge:
  runs-on: ubuntu-latest
  needs:
    - android-tests
    - unit-tests
  steps:
    - name: Merge Artifacts
      uses: actions/upload-artifact/merge@v4
      with:
        name: test-results
        pattern: test-results-*

You can see full PR here: https://github.com/appunite/Loudius/pull/190/files

Version 4 has ultimately been a downgrade for me, as it removes the ability to replace artifacts. I have a job in my workflow that runs only for certain architectures and does Apple code signing. It downloads an artifact (which contains a binary), signs it, and then re-uploads the artifact, replacing the original. But I cannot do this with v4. The merge action does not address this use case either.

My artifacts are named after architectures in a matrix build, for example: linux-x64, win-x64, osx-x64. It's important that the names remain the same, even if the artifacts get replaced, because my final release job iterates over the artifacts and attaches them to a GitHub Release. So it needs to be able to find osx-x64, not something like osx-x64-signed, because I won't have equivalent versions for Linux, for example (e.g. linux-x64-signed doesn't make sense). The name the artifacts start with is the name they have to end with.

Immutable packages don't need to change, but we should have the ability to replace them.

ofek commented

The merge action does not address this use case either.

I just successfully upgraded to v4 a few days ago where I do the exact same thing that you're talking about... pypa/hatch@03b1bca#diff-c4330c2cfbadba3e17aef4105738e5644a48e08e86096a32e2a6625fcd83b957

The merge action does not address this use case either.

I just successfully upgraded to v4 a few days ago where I do the exact same thing that you're talking about... pypa/hatch@03b1bca#diff-c4330c2cfbadba3e17aef4105738e5644a48e08e86096a32e2a6625fcd83b957

Your workflow file appears to be structured differently than mine. Here is how I'm doing things:
https://github.com/recyclarr/recyclarr/blob/master/.github/workflows/build.yml

Taking a peek at your workflow, it looks like you make staged-* artifacts for every platform, including macOS. However, in my case I do not have additional preparation work to do, so technically only macOS needs staging artifacts. Upgrading to v4 shouldn't force me to create unnecessary staging/temporary artifacts when in v3 a simple "re-upload" was sufficient for the special cases.

@rcdailey there's now an overwrite: true option...

@rcdailey there's now an overwrite: true option...

How embarrassing; I completely missed this in the documentation. Thank you very much for pointing it out. I tested with this setting and it works great!
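
For anyone landing here with the same use case, a minimal sketch (the path is hypothetical; the artifact name comes from the scenario above):

- uses: actions/upload-artifact@v4
  with:
    name: osx-x64    # same name as the artifact being replaced
    path: signed/    # hypothetical location of the re-signed binary
    overwrite: true  # deletes any existing artifact with this name before uploading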