actions/upload-artifact

[bug] (v4) Unable to upload to same artifact name from multiple jobs

DanTup opened this issue · 39 comments

DanTup commented

What happened?

The PR from dependabot to upgrade to v4 is failing on my project with this error:

Error: Failed to CreateArtifact: Received non-retryable error: Failed request: (409) Conflict: an artifact with this name already exists on the workflow run

It seems like this is a breaking change that wasn't mentioned in the changelog and I'm not sure if it was deliberate.

There's some discussion about this behaviour in #279 and it suggests that it was fine to do this and there wouldn't be issues as long as the filenames within the artifact are unique. This was convenient to bundle the logs from several shards together into a single artifact rather than having lots of individual zip files to download.

What did you expect to happen?

I expected everything to work the same as in v3 unless it was noted as a deliberate breaking change.

How can we reproduce it?

Create multiple jobs that upload artifacts with the same name (but the files from each job are uniquely named).
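
A minimal sketch of such a workflow. The workflow name, shard values, and file names are made up for illustration and are not taken from my project:

```yaml
# Hypothetical repro: two matrix jobs upload differently named files
# under the same artifact name.
name: repro-artifact-conflict
on: workflow_dispatch

jobs:
  test:
    strategy:
      matrix:
        shard: [1, 2]
    runs-on: ubuntu-latest
    steps:
      - name: Produce a uniquely named log file per shard
        run: |
          mkdir -p logs
          echo "shard ${{ matrix.shard }}" > "logs/shard-${{ matrix.shard }}.log"
      - uses: actions/upload-artifact@v4
        with:
          name: logs   # same name in every matrix job
          path: logs
```

With v3 both uploads ended up in one `logs` artifact. With v4, whichever job uploads second should hit the 409 Conflict error quoted above.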

Anything else we need to know?

No response

What version of the action are you using?

v4.0.0

What are your runner environments?

linux, windows, macos

Are you on GitHub Enterprise Server? If so, what version?

No response

DanTup commented

Actually, it seems this is called out here:

https://github.com/actions/upload-artifact?tab=readme-ov-file#v4---whats-new:~:text=The%20contents%20of%20an%20Artifact%20are%20uploaded%20together%20into%20an%20immutable%20archive.%20They%20cannot%20be%20altered%20by%20subsequent%20jobs.%20Both%20of%20these%20factors%20help%20reduce%20the%20possibility%20of%20accidentally%20corrupting%20Artifact%20files.

The contents of an Artifact are uploaded together into an immutable archive. They cannot be altered by subsequent jobs. Both of these factors help reduce the possibility of accidentally corrupting Artifact files.

It just wasn't included in the "What's changed" section of the Dependabot release notes because it just has a summary saying "Lots has changed". I should've followed the link through.

Seems like this is certainly intended though.

Well, this is bad news for me. I find it convenient to use upload-artifact to write different files to the same folder in a build matrix, for example to compile custom C extensions for several combinations of Python versions and operating systems and publish them to a single folder.

Like in https://github.com/Neoteroi/BlackSheep/actions/runs/7370452109/job/20056867940

Now, if I want to upgrade my workflow, I need to publish to different folders and download artifacts from multiple sources, which makes the workflow look like a mess compared to how clean it used to be. For now I'm staying with v3, and I hope this will be reconsidered in a future version of these actions.

Yeah, I rolled back to v3 too. Until I'm forced to upgrade, the old way is much more convenient for me.

I had to roll back also

Same here. I also use a matrix to build multi-platform releases in the same directory and then zip them all together. Rolling back to v3 :(

Seems like a lot of people have been bitten by this, so although it appears to have been deliberate I'm re-opening for better visibility to see if the authors want to chime in (of course, it's very possible it may just be closed as WAI).

The behaviour of several jobs saving different files, with different names, in the same directory, to be downloaded as a single archive once all the jobs succeeded, was very desirable and didn't require the use of actions/download-artifact.

https://github.com/psycopg/psycopg/blob/fe097e2e4356a4332a54ae21e1c4307bc7c19b4f/.github/workflows/packages-src.yml

Moving to v4 seems like a major change which, for the moment, we will avoid.

There were only 27 commits according to the v4.0.0 release notes.

Breaking changes (and this definitely is a breaking change) should absolutely be called out in major version bump releases' release notes.

It's explicitly included in the readme:
https://github.com/actions/upload-artifact?tab=readme-ov-file#breaking-changes

That same text should absolutely be included in the release notes.

Conveniently GitHub lets you rewrite release notes at any time, so this can and should be fixed.

@robherley you wrote "Blog post coming soon!" in #466 (comment), I presume that's:
https://github.blog/changelog/2023-12-14-github-actions-artifacts-v4-is-now-generally-available/

But it'd be really good if you had added a comment in the PR itself instead of forcing people to Google for it. (You could also include a link to the blog post in the release notes.)

Beyond that, most of its content, which I will excerpt below, should be in the release notes, and probably in the readme. Note that the readme content does not match the blog post.

Blog post first:

  • Artifacts will be scoped to a job rather than a workflow. This allows the artifact to become immediately available to download from the API after being uploaded, which was not possible before.
  • Artifacts v4 is not cross-compatible with previous versions. For example, an artifact uploaded using v3 cannot be used with actions/download-artifact@v4.
  • Using upload-artifact@v4 ensures artifacts are immutable, improving performance and protecting objects from corruption, which would often happen with concurrent uploads. Artifacts should be uploaded separately and then downloaded into a single directory using the two new inputs, pattern and merge-multiple, available in download-artifact@v4. These objects can then be re-uploaded as a single artifact.
  • A single job can upload a maximum of 10 artifacts.
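
As a rough sketch of the flow the blog post describes (upload separately, download into a single directory with pattern and merge-multiple, then re-upload as one artifact), something like the following should work. The job names, file names, and the `all-dist` artifact name are assumptions for illustration only:

```yaml
# Hypothetical workflow fragment: per-job artifacts, then one combined artifact.
jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - name: Produce a uniquely named file per job
        shell: bash
        run: |
          mkdir -p dist
          echo "built on ${{ matrix.os }}" > "dist/file-${{ matrix.os }}.txt"
      - uses: actions/upload-artifact@v4
        with:
          name: dist-${{ matrix.os }}   # one artifact per matrix job
          path: dist

  collect:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Download every dist-* artifact into one directory
        uses: actions/download-artifact@v4
        with:
          pattern: dist-*
          merge-multiple: true
          path: dist
      - name: Re-upload the merged directory as a single artifact
        uses: actions/upload-artifact@v4
        with:
          name: all-dist   # chosen so it does not match the dist-* pattern
          path: dist
```

Note that the per-job artifacts remain listed on the run alongside the combined one.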

Readme:

  1. On self hosted runners, additional firewall rules may be required.
  2. Uploading to the same named Artifact multiple times.
    Due to how Artifacts are created in this new version, it is no longer possible to upload to the same named Artifact multiple times. You must either split the uploads into multiple Artifacts with different names, or only upload once. Otherwise you will encounter an error.
  3. Limit of Artifacts for an individual job. Each job in a workflow run now has a limit of 10 artifacts.

My data point against v4: generating documentation.
I have a matrix of jobs (for different versions of build environments), each of which generates the documentation. I don't want to specify "only generate the docs on this particular version", as the matrix of versions changes frequently. I don't care which job overwrites the docs generated by which other job, as the docs are mostly the same: just give me any one of them.

The "correct" way of doing things: name each artifact after the job matrix.
But the job matrix is specified by the Docker images, e.g. "username/repo:version", which is a bad filename. I really don't want to write a script just to compute a valid filename for the artifact.
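
One possible way around the filename problem, offered as a sketch rather than a recommendation from the maintainers, is to name the artifact after the job's position in the matrix instead of its value. The `strategy.job-index` context is a zero-based number, so it contains no characters such as `:` or `/`. The image tags and the `docs-out` directory below are hypothetical:

```yaml
# Hypothetical fragment: artifact names derived from the matrix index,
# not from the Docker image string.
jobs:
  docs:
    strategy:
      matrix:
        image: ["username/repo:1.0", "username/repo:2.0"]
    runs-on: ubuntu-latest
    container: ${{ matrix.image }}
    steps:
      - name: Generate docs (placeholder command)
        run: mkdir -p docs-out && echo "docs" > docs-out/index.html
      - uses: actions/upload-artifact@v4
        with:
          name: docs-${{ strategy.job-index }}   # docs-0, docs-1, ...
          path: docs-out
```

If any one copy of the docs will do, a later job could download just `docs-0`. The trade-off is that the index says nothing about which image produced the artifact.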

Same here, just rolled back to v3 😞

Just want to chip in that this caused a lot of issues for our company as well.

👋 @RobertoPrevato For your example, you don't have to make too many changes, e.g.

https://github.com/Neoteroi/BlackSheep/blob/b283414c88e2d32675a1ca982d937a5dab75b532/.github/workflows/main.yml#L209-L212

Change that line to:

      - uses: actions/upload-artifact@v4
        with:
          name: dist-${{ matrix.os }}-${{ matrix.python-version }}
          path: dist

Then, in your publish job:

https://github.com/Neoteroi/BlackSheep/blob/b283414c88e2d32675a1ca982d937a5dab75b532/.github/workflows/main.yml#L226-L230

You can have it download all the artifacts matching a pattern to the same directory:

      - name: Download a distribution artifact
        uses: actions/download-artifact@v4
        with:
          pattern: dist-*
          merge-multiple: true
          path: dist

This case is outlined in the migration document: https://github.com/actions/download-artifact/blob/main/docs/MIGRATION.md

I'm happy to help any others with their workflow scenarios, thanks all for the feedback!

@robherley Thank you! I appreciate your help very much. I'll try that as soon as I get the time.

@robherley

Uploads and downloads must use the same actions versions.

It isn't obvious that you mean "the same major action version." -- If that's what's intended.

@robherley saved me from having to downgrade to v3! Thanks!

I rolled back to v3, too.

The solution provided in the comment above works just fine for v4 actions:

- name: Upload artifacts
  uses: actions/upload-artifact@v4
  with:
    name: dist-${{ matrix.os }}
    path: dist

- name: Download artifacts
  uses: actions/download-artifact@v4
  with:
    pattern: dist-*
    merge-multiple: true
    path: dist

I rolled back to v3, too. Does anyone know of an alternative repo for this, maybe a fork based on v3?
(Update) I've started thinking about making my own v4 fork and removing that "feature". I'm now checking how hard it will be to keep my fork synchronised with this repo.

@robherley in #478 (comment) you pointed to: https://github.com/actions/download-artifact/blob/main/docs/MIGRATION.md

I believe I've made a faithful reproduction of your workflow (v3 and v4), and it is not a drop-in replacement.
https://github.com/check-spelling-sandbox/artifact-merge-hell/actions/runs/7729203589/job/21071843037

-Run ls -R my-v3-artifact
-my-v3-artifact:
-my-v3-artifact
-
-my-v3-artifact/my-v3-artifact:
+Run ls -R my-v4-artifact
+my-v4-artifact:
 file-macos-latest.txt
 file-ubuntu-latest.txt
 file-windows-latest.txt

Since paths are how people find files, the fact that the paths do not match would break anyone trying to use it.

I'm not sure that's precisely why I gave up, but I can assure you it is one of the problems I encountered.

👋 @jsoref apologies, I had a typo in the migration docs.

The reason why you have an extra my-v3-artifact/my-v3-artifact in the v3 download is the behavior outlined here in the v3 docs.

If the name input parameter is not provided, all artifacts will be downloaded. To differentiate between downloaded artifacts, a directory denoted by the artifacts name will be created for each individual artifact.

These lines should be:

with:
  name: my-artifact
  path: my-artifact

In v4, this behavior is now toggleable with the merge-multiple parameter.

I'll update the migration docs to include the name parameter.

Thanks, with that change, the results now do look compatible: https://github.com/check-spelling-sandbox/artifact-merge-hell/actions/runs/7730117258

Looks like https://github.com/actions/upload-artifact/tree/main/merge should be the solution for this problem. See PR #505.
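
A minimal sketch of how the merge subaction might be used, assuming an earlier job named `build` (hypothetical) has already uploaded artifacts named `dist-*`. The input names are the ones I understand the merge README to document, so please check it before relying on them:

```yaml
# Hypothetical fragment: combine the per-job dist-* artifacts into one.
jobs:
  merge:
    needs: build            # assumed name of the matrix job that uploads dist-*
    runs-on: ubuntu-latest
    steps:
      - uses: actions/upload-artifact/merge@v4
        with:
          name: dist            # name of the combined artifact
          pattern: dist-*       # which artifacts to merge
          delete-merged: true   # remove the per-job artifacts after merging
```

With `delete-merged: true`, only the combined artifact should remain listed on the run, which keeps the artifact list from filling up with one entry per matrix job.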

Take a look at https://github.com/actions/upload-artifact/blob/main/docs/MIGRATION.md#overwriting-an-artifact

      - name: New override option
        uses: actions/upload-artifact@v4
        with:
          name: build-artifact
          path: ./example
          overwrite: true

The solution provided in the comment above works just fine for v4 actions:

- name: Upload artifacts
  uses: actions/upload-artifact@v4
  with:
    name: dist-${{ matrix.os }}
    path: dist

- name: Download artifacts
  uses: actions/download-artifact@v4
  with:
    pattern: dist-*
    merge-multiple: true
    path: dist

It works like a charm, thank you!

Here is an example of how it could be solved for BigBlueButton:
bigbluebutton/bigbluebutton@0f726d5.

Rolling back to v3 as well :(
actions/upload-artifact@v4 broke the entire pipeline, and none of the mentioned solutions solved our problem.
In our workflow, we run parallel jobs in a matrix strategy that generate loads of files, which were uploaded into a single results folder in v3. With v4, our parallel jobs upload a large number of folders with unique names, and one more job uses actions/upload-artifact/merge@v4 to merge these folders into a single one that is downloadable through the UI.
The solution technically works but is unsuitable for us, because all the artifact folders from the parallel jobs are also listed as downloadable in the UI, which we don't need, and it makes a big mess of our artifacts.

Also had to revert to v3 here microsoft/cppwinrt#1409

@kryshenp, @kennykerr, it's not necessary (and also not a good idea) to revert to v3, which will stop working in the future. Read my comment above on how to fix your code to make v4 work.

Hi,

At first I was also surprised by this breakage. Yes, it creates an unexpected migration effort, and I also reverted immediately. But after thinking about it, I realized why it was done and migrated our usage.

The previous implementation was probably a mistake; the new implementation is more consistent.

The advantages of the new stateless implementation:

  1. Artifacts are available immediately after creation, no need to wait until the workflow completes.
  2. The implementation is much faster, as artifacts are not merged when merging is not needed, and it takes less space to manage the process.
  3. Consistent approach of artifacts within workflow or reusable workflow.
  4. Consistent approach when running/rerunning partial workflow.

I hope the above helps explain the WHY; it is easier to perform a migration when we understand the WHY.

Thanks,
Alon

It was just announced that v3 will be disabled on 2024-11-30.

https://github.blog/changelog/2024-04-16-deprecation-notice-v3-of-the-artifact-actions/

It is disappointing that the disabling of v3 was announced before v4 reached feature completeness.

We are having the same issue

Then update your code. The description of how to do that can be found above.

Take a look at main/docs/MIGRATION.md#overwriting-an-artifact

      - name: New override option
        uses: actions/upload-artifact@v4
        with:
          name: build-artifact
          path: ./example
          overwrite: true

Thanks for the notice! Seems like #501 has fixed the issue here.

If your old configuration merged the artifacts, then overwrite: true would be the wrong choice. Use merge-multiple: true.

I worked around this problem by appending ${{ github.run_number }}-${{ github.run_attempt }} to the artifact name. This guarantees a clean, traceable, unique name on every (re)run.
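
A sketch of what that might look like. The job name, the `build-` prefix, and the path are placeholders:

```yaml
# Hypothetical fragment: artifact name made unique per run and per re-run attempt.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Produce something to upload (placeholder command)
        run: mkdir -p example && echo "output" > example/result.txt
      - uses: actions/upload-artifact@v4
        with:
          name: build-${{ github.run_number }}-${{ github.run_attempt }}
          path: ./example
```

Both values are the same for every job within a single run attempt, so in a matrix the name would presumably still need a per-job component (for example a matrix value) to avoid the conflict.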

I'm setting overwrite: true and still getting this. Am I missing something?

Run actions/upload-artifact@v4
  with:
    name: tf-module-cache-key-file
    retention-days: 3
    path: .terraform/modules/modules.json
    overwrite: true
    if-no-files-found: warn
    compression-level: 6
  
With the provided path, there will be 1 file uploaded
Artifact name is valid!
Root directory input is valid!
Error: Failed to CreateArtifact: Received non-retryable error: Failed request: (409) Conflict: an artifact with this name already exists on the workflow run

@dreinhardt89: you shouldn't use overwrite: true, you should give each of your artifacts a distinct name and then use the merge subaction: https://github.com/actions/upload-artifact/blob/main/merge/README.md

Just an FYI to those holding onto v3:

actions/upload-artifact@v3 is scheduled for deprecation on November 30, 2024

I'm unsure whether this will stop you from staying on v3 (in case something changes on GitHub's side on that date).