rust-lang/cargo

cargo publish multiple packages at once

alexcrichton opened this issue · 32 comments

It would be nice to have a flag to cargo publish which publishes all local packages in a DAG fashion.

Non-atomic publish was added in #14433; stabilization is being tracked in #10948.

Notes (edit ehuss):

  • #9507 would block this
  • It would be nice if it supported atomic publishing.
  • --dry-run should work correctly (will need to pretend that the previous crates have been published, maybe via patch?) See also rust-lang/crates.io#1515.
  • See rust-lang/crates-io-cargo-teams#82 for other considerations for a new publish API to better accommodate this.

This would be awesome ;)

I suspect we may want to have a story with #883 if/when we implement this: I could easily imagine unintentionally publishing crates by not realising that I'm depending on them.

botev commented

Is there any progress on this? I think this would be a very good thing to have. One of the nice things about crates is that they are modular and thus reduce the compile time of larger projects. However, if you are developing a large library, compile times can get pretty slow (when it hits >2min I start compiling after fixing 50% of the issues, so it can compile while I'm fixing the other 50%). In this case you can split the library into sub-crates, where each sub-crate is in some sense standalone, or a build-up over a "core" structure. However, none of them would make sense on its own. Publishing each one individually does not make sense either; they should be packaged together. If we could have this kind of publish --all, it would give us an option to reduce compile times without any drawback on the user side or the publishing side of the library.

Any updates whether this will be implemented and when?

@matthiasbeyer I think this fell through the cracks when we did workspaces. This issue was supposed to track wrapping --all up for publish but I guess it got lost since it was originally opened earlier?

@alexcrichton what's the status?

@wycats AFAIK this has always been in the 'nice to have' category and hasn't progressed to the 'someone has put time into designing this' category.

One thing I just thought about: When doing a workspace release, cargo should build everything and after everything is fine, publish all crates at once... if that is even possible. Not like "build & publish the first crate and continue for each crate" but rather "build everything, then publish everything".

It seems that in a foo-rs, foo-rs/foo-sys project layout, you can't cargo package in foo-rs before foo-rs/foo-sys has been published.

I created a simple PR to deal with this annoying issue; it allows running cargo publish from a workspace.

One thing I just thought about: When doing a workspace release, cargo should build everything and after everything is fine, publish all crates at once... if that is even possible. Not like "build & publish the first crate and continue for each crate" but rather "build everything, then publish everything".

Unluckily I did not find a way to implement this logic because cargo requires that all the dependencies of a package are in the repository or the package phase (when the tarball is created) fails; consequently, before publishing a package with a "path" dependency, that dependency must be in the repository.

Something else I would want, which I don't think that PR can handle, is publishing only the updated packages: if one of the packages already exists at its current version number, it should be downloaded and verified to be identical to the new package.

being new to workspaces but not to cargo this feels very much like a paper cut. My finger just got nipped when I tried to publish a new workspace project by following the docs


...then realized that order matters. The validation of packages will fail if one of the workspace packages depends on another in the same release which has not been published first. In my case it's a very simple ordering, but for those new to cargo, something like cargo publish --all could get the ordering right without putting a burden on me at all. Something like that would be a much nicer new user experience, and also a much nicer convenience for those who have learned the ordering semantics of the more manual publishing approach.

It would definitely help me with imag where I publish over 50 crates in one release!

This sounds useful, but I'm wondering what the exact behaviour should be. What if you have a path dependency and there's no version specified, should Cargo modify the Cargo.toml for that?

@torkleyy interesting point. It probably should not modify the Cargo.toml at all, but it would be worth considering whether or not cargo should allow for path dependencies but only for crates within the same workspace.

E.g. If building foo and a dependency on bar is specified via a path, perhaps cargo could implicitly publish the new version of foo with the version of bar that was published during the same cargo publish --all? This would also allow for cargo to easily build everything first locally before publishing any of the packages.

Simplified Steps

I'm imagining cargo publish --all should do something like the following:

  1. Check manifests and version validity of all packages before starting to build any.
  2. Create DAG of workspace packages and determine the build order.
  3. Build packages. If any failures occur, bail out of the whole process.
  4. If all manifests are valid and all packages are built, publish all packages in the order in which they were built.
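Step 2 above can be sketched as a topological sort. This is only an illustration under stated assumptions: `publish_order` and its input map are hypothetical names, not Cargo APIs, and the dependency map is assumed to have been read from the workspace manifests already. It uses Kahn's algorithm, which also reports cycles, so the process can bail out before anything is built or published.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Returns workspace package names ordered so that every package comes after
/// all of its workspace dependencies. On failure, returns the packages that
/// could not be ordered (they are part of, or depend on, a cycle).
///
/// `deps` maps each package to the packages it depends on.
/// Hypothetical helper for illustration; not a Cargo API.
pub fn publish_order(
    deps: &BTreeMap<String, Vec<String>>,
) -> Result<Vec<String>, Vec<String>> {
    // Number of not-yet-ordered workspace dependencies per package.
    let mut pending: BTreeMap<&str, usize> = BTreeMap::new();
    // Reverse edges: dependency -> packages that depend on it.
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();

    for (pkg, pkg_deps) in deps {
        // Dependencies that are not workspace members (not keys of `deps`)
        // do not affect the ordering and are ignored.
        let local: BTreeSet<&str> = pkg_deps
            .iter()
            .map(String::as_str)
            .filter(|d| deps.contains_key(*d))
            .collect();
        pending.insert(pkg.as_str(), local.len());
        for dep in local {
            dependents.entry(dep).or_default().push(pkg.as_str());
        }
    }

    // Start with the leaves: packages with no workspace dependencies.
    let mut queue: VecDeque<&str> = pending
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(p, _)| *p)
        .collect();

    let mut order = Vec::new();
    while let Some(pkg) = queue.pop_front() {
        order.push(pkg.to_string());
        for &dependent in dependents.get(pkg).into_iter().flatten() {
            let n = pending
                .get_mut(dependent)
                .expect("dependent is a workspace package");
            *n -= 1;
            if *n == 0 {
                queue.push_back(dependent);
            }
        }
    }

    if order.len() == deps.len() {
        Ok(order)
    } else {
        // Anything that still has pending dependencies is stuck behind a cycle.
        Err(pending
            .into_iter()
            .filter(|(_, n)| *n > 0)
            .map(|(p, _)| p.to_string())
            .collect())
    }
}
```

With this shape, steps 3 and 4 would both walk the same returned list, so the build order and the publish order cannot disagree.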

It would be worth considering whether step 4 should be a special "atomic step" recognised by crates.io, so that if for some reason the net drops out or there's a crash, the user doesn't end up with only half of their packages published.

@mitchmindtree I would like to also see an extra step between 3 and 4 doing full workspace package validation to replace the current pre-publish validation. This does the normal per-package validation steps with 2 changes:

  1. crates.io is queried for an exact version match of any of the crates. If there is an exact version match, the contents are compared to make sure the crate hasn't changed; if they match, the crate is removed from the set to publish. (Basically making publishing a crate idempotent so that you can republish a workspace where you have only updated some of the crates.)

  2. during this validation crates can depend on crates that are either available on crates.io or are part of the current set to publish.
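The idempotency rule in point 1 can be sketched as a three-way decision per package. Everything here is an assumption made for illustration: the `Action` enum, the `plan` function, the `"name@version"` key format, and the idea that the registry exposes a checksum comparable with that of the locally packaged `.crate` file. It also assumes that packaging the same sources yields the same bytes, which this thread does not establish.

```rust
use std::collections::BTreeMap;

/// What to do with one workspace package under the proposed validation.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// This version is not on the registry yet: publish it.
    Publish,
    /// The same version with identical contents already exists: skip it.
    SkipUnchanged,
    /// The same version exists with different contents: stop and report.
    Conflict,
}

/// `local_checksum` is the checksum of the freshly packaged `.crate` file.
/// `published` maps "name@version" to the checksum reported by the registry.
/// Both inputs are hypothetical; this is not a Cargo or crates.io API.
pub fn plan(
    name: &str,
    version: &str,
    local_checksum: &str,
    published: &BTreeMap<String, String>,
) -> Action {
    match published.get(&format!("{}@{}", name, version)) {
        None => Action::Publish,
        Some(remote) if remote.as_str() == local_checksum => Action::SkipUnchanged,
        Some(_) => Action::Conflict,
    }
}
```

Treating a mismatch as a hard `Conflict` rather than silently skipping is one possible policy; it avoids a changed crate going unpublished because its version number was not bumped.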

I started working on this here: https://gitlab.com/torkleyy/cargo-publish-all

Thanks for sharing. This looks like a good stop gap for not having this as a built in. The main reason I'd still champion this as a built-in is for consistency as other built-in commands support the --all flag.

What's the status of this issue?

Sorry for the drive by, but in case anyone is wondering, I got inspired by lerna and made https://github.com/pksunkara/cargo-workspaces. rust-analyzer and chalk are using it to auto publish weekly.

  • --dry-run should work correctly (will need to pretend that the previous crates have been published, maybe via patch?)

I am currently using the following process to simulate dry-run publishing a workspace of crates. The process builds a vendor/ directory containing all the published dependencies, and all the crates in the workspace.

Potentially cargo could use this same, or similar, process.

The process is:

  • Temporarily remove git dependencies from workspace Cargo.toml.
  • Vendor all dependencies in a vendor/ directory.
  • Undo changes to Cargo.toml.
  • Package all crates, verification disabled.
  • Extract the *.crate files into vendor/.
  • Add a .cargo-checksum.json file to each crate.
  • Package all crates, verification enabled using vendor/ as source replacement for crates.io.

This process exists in the GitHub Actions workflow below and I use it on ~6 repositories successfully to perform verification on workspaces before starting publishing.

https://github.com/stellar/actions/blob/70940b15e/.github/workflows/rust-publish-dry-run.yml

alamb commented

I would like to say thank you to @epage and everyone who worked on this feature. It has made my life publishing crates from https://github.com/apache/arrow-rs much better

I started working on this here: https://gitlab.com/torkleyy/cargo-publish-all

@torkleyy I just tested your tool on my workspace I'm trying to publish on https://github.com/umccr/htsget-rs/tree/better_ci but no output is returned and fails with exit code 1.

OTOH, trying to publish my first set of crates ever is turning out to be a fun circular ride... not the UX I had with the rest of the cargo subcommands :_S Here's me trying to "go for the leaves first" and hitting some walls:

/cc @mmalenic

% cargo publish -p htsget-http-actix
    Updating crates.io index
   Packaging htsget-http-actix v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/htsget-http-actix)
error: failed to prepare local package for uploading

Caused by:
  no matching package named `htsget-http-core` found
  location searched: registry `crates-io`
  required by package `htsget-http-actix v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/htsget-http-actix)`

% cargo publish -p htsget-http-core
    Updating crates.io index
   Packaging htsget-http-core v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/htsget-http-core)
   Verifying htsget-http-core v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/htsget-http-core)
error: failed to verify package tarball

Caused by:
  no matching package named `htsget-search` found
  location searched: registry `crates-io`
  required by package `htsget-http-core v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/target/package/htsget-http-core-0.1.0)`

% cargo publish -p htsget-search
    Updating crates.io index
   Packaging htsget-search v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/htsget-search)
   Verifying htsget-search v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/htsget-search)
error: failed to verify package tarball

Caused by:
  no matching package named `htsget-test-utils` found
  location searched: registry `crates-io`
  required by package `htsget-search v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/target/package/htsget-search-0.1.0)`

% cargo publish -p htsget-test-utils
   Updating crates.io index
   Packaging htsget-test-utils v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/htsget-test-utils)
   Verifying htsget-test-utils v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/htsget-test-utils)
error: failed to verify package tarball

Caused by:
  no matching package named `htsget-http-core` found
  location searched: registry `crates-io`
  required by package `htsget-test-utils v0.1.0 (/Users/rvalls/dev/umccr/htsget-rs/target/package/htsget-test-utils-0.1.0)`

epage commented

@brainstorm Looks like you have a cycle involving a dev-dependency. The simplest way of resolving that is to remove the cycle by removing the version field in dev-dependencies that also have a path. cargo add does this by default. The main downside is that people who want to run the tests from the registry copy of the crate won't be able to. In practice, this is just crater (the tool that tests rust changes across all of crates.io).

In cargo-release, we just expect people to use the workaround above. I think cargo-smart-release has some special logic to handle cycles.

epage commented

Trying to summarize this thread with some of my thoughts

Prior art

High-level path

This intentionally leaves off "don't publish if it's already published" as I see that as separate, though related, to this issue and it has logic/policies to be worked out.

Multi-package packaging

Requirements

  • Add to cargo package the normal package selection CLI (--package, --exclude, --workspace), including behavior (default-members)
  • Dry-run can verify all of the packages

For dry-run, we'd need to build in-order and patch in the .crates that we already packaged (see #1169 (comment)).

When verifying, we likely should detect dependency cycles to give people errors early (#1169 (comment)).

Multi-package publishing

Because this isn't atomic, we should try to do all verification upfront so there aren't errors along the way

  • Package everything (including verification) before any publishes
    • Also important once we have atomic publish as we'll need all of the .crates
  • Checking if any of the proposed versions are already published
  • Ensuring dependencies are published or will be published as part of this
  • ...
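The third check in the list above ("dependencies are published or will be published as part of this") can be sketched as a pure function that runs before any upload. The function name and both inputs are hypothetical; in particular, this sketch only checks names, not version requirements.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns `(package, dependency)` pairs where the dependency is neither
/// already on the registry nor part of the set being published.
/// An empty result means this pre-flight check passed.
///
/// `to_publish` maps each package in this publish run to its dependencies;
/// `on_registry` holds the names already available on the registry.
/// Hypothetical helper for illustration; not a Cargo API.
pub fn missing_dependencies(
    to_publish: &BTreeMap<String, Vec<String>>,
    on_registry: &BTreeSet<String>,
) -> Vec<(String, String)> {
    let mut missing = Vec::new();
    for (pkg, deps) in to_publish {
        for dep in deps {
            if !to_publish.contains_key(dep) && !on_registry.contains(dep) {
                missing.push((pkg.clone(), dep.clone()));
            }
        }
    }
    missing
}
```

Reporting every missing pair at once, rather than failing on the first, fits the goal of doing all verification upfront so nothing fails partway through the run.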

When publishing, we'll also have to do it in-order.

If a publish PUT fails, we need to be clear about what didn't get published for people to recover.

We should track the wait-for-publish timeouts from #11602 on a per-package basis so

  • Publishing a package only waits for its dependencies to be available (or timeout)
  • At the tail end, we only have one timeout if we are waiting on multiple leaf packages
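One way to picture the per-package waiting described above is a function that is re-evaluated whenever the registry index changes. This is a sketch under assumptions: the names are hypothetical, and "available" is taken to mean "visible in the registry index", as opposed to merely uploaded.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Packages that can be uploaded right now: not uploaded yet, and every
/// workspace dependency is already visible in the registry index.
///
/// `deps` maps each workspace package to its dependencies, `uploaded` holds
/// packages whose upload has completed, and `available` holds packages that
/// the index already lists. Hypothetical helper; not a Cargo API.
pub fn ready_to_upload(
    deps: &BTreeMap<String, Vec<String>>,
    uploaded: &BTreeSet<String>,
    available: &BTreeSet<String>,
) -> Vec<String> {
    deps.iter()
        .filter(|(pkg, _)| !uploaded.contains(*pkg))
        .filter(|(_, pkg_deps)| {
            pkg_deps
                .iter()
                // Only workspace members gate the upload order.
                .filter(|d| deps.contains_key(*d))
                .all(|d| available.contains(d))
        })
        .map(|(pkg, _)| pkg.clone())
        .collect()
}
```

Under this model a package never waits on packages it does not depend on, and once the last leaves are uploaded a single final wait can cover all of them, matching the two bullets above.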

Additional Info

For me, the biggest open question is how to build the DAG for packaging and then pass that up to publishing. Last I looked, the main way I saw for interacting with the DAG was compilation, which is too heavy-handed for what we need.

For me, the biggest area of complexity is the dependency + timeout tracking while publishing in-order.

We likely should detect dependency cycles to give people errors early (#1169 (comment)).

Dependency cycles are actually allowed, as long as the package names have been previously registered on crates.io without cycles. There's no version checking performed by crates.io, just a simple name check, so you can publish crates that depend on future unpublished versions of other crates.

epage commented

I've edited it to clarify that this applies when verifying, as verification does not support cycles (apart from the automatically removed dev-dependencies).

epage commented

An interesting challenge for us to keep in mind with this is registries that rate-limit, both in terms of finding the right strategy for backing off but also the right UX so someone doesn't publish 300 crates and it takes 24 hours without any clear indication.
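As one possible back-off strategy (not something this thread has settled on), a capped exponential delay could look like the sketch below. `backoff_delay` and its parameters are hypothetical, and the sketch assumes the client can detect that a registry is rate-limiting it.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`,
/// never exceeding `max`. Hypothetical helper; not a Cargo API.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate the multiplier instead of overflowing for large attempts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // If the multiplication itself overflows, fall back to the cap.
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}
```

For the UX concern, the computed delay could be shown to the user before each wait, so a long multi-crate publish reports what it is waiting for rather than appearing to hang.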

This issue is S-accepted, but it seems to depend on #10948, which isn't. Any chance that can get accepted?

Now that #13947 is merged, it's time to look into multi-package publishing:

  • Package everything (including verification) before any publishes
  • Checking if any of the proposed versions are already published
  • Ensuring dependencies are published or will be published as part of this
  • When publishing, we'll also have to do it in-order.

What's a good way to tackle the following?

the biggest open question is how to build the DAG for packaging and then pass that up to publishing.

Would it make sense for packaging to output the packaged order to a file, and let publishing read that? Or should publishing simply re-run the same ordering logic that packaging did, almost like a dry-run packaging?

For clearer status, writing it out with links to issues.

  • Package everything (including verification) before any publishes (mostly done via #13947)
  • Checking if any of the proposed versions are already published (#14338)
  • Ensuring dependencies are published or will be published as part of this
  • When publishing, we'll also have to do it in-order.

Would it make sense for packaging to output the packaged order to a file, and let publishing read that? Or should publishing simply re-run the same ordering logic that packaging did, almost like a dry-run packaging?

imo we shouldn't do sideband communication (using the filesystem to pass state from one function to another).

If we could have a single graph for both, that would be ideal as it removes the risk of the two graph traversals disagreeing.

I think ideally, we'd adjust the packaging abstraction so cargo package and cargo publish call into the same graph generation (and registry inferring) code, pass that to a graph packaging function, and then publish walks the graph to publish everything.