bazel-contrib/rules_nodejs

RFC: npm_fetch_packages rule

alexeagle opened this issue · 6 comments

In other modern Bazel rulesets, such as https://github.com/dillon-giacoppo/rules_python_external, the dependency story (commonly called the "external" model) is roughly:

  • let users bring the configuration files they already use with their package manager, e.g. requirements.txt
  • Bazel's job is to understand all the packages/versions you want, fetch the tarballs, and lay them out in individual external repositories
  • make sure the postinstall for each package gets run (maybe run npm install on each package individually)
  • consumers list the packages in their deps (possibly with some syntax sugar so they all come from one load, like the requirements() helper in rules_python_external)
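As a rough sketch, the proposed flow might look like the following. Note that the rule name, its attributes, and the generated repository labels are all hypothetical here, for the sake of discussion:

```starlark
# WORKSPACE — hypothetical syntax for the proposed repository rule
load("@build_bazel_rules_nodejs//:index.bzl", "npm_fetch_packages")

npm_fetch_packages(
    name = "npm",
    # Reuse the lockfile the user already maintains with their package manager.
    package_lock_json = "//:package-lock.json",
)

# BUILD — each package lands in its own external repository, so a target
# depends only on (and Bazel only fetches) the packages it actually needs.
nodejs_binary(
    name = "app",
    entry_point = "main.js",
    deps = [
        "@npm_typescript//:pkg",
        "@npm_semver//:pkg",
    ],
)
```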

We could do this for rules_nodejs; for the sake of discussion I'll call the proposed repository rule npm_fetch_packages().

For npm, this has tradeoffs.

Downsides:

  • we have to lay out individual packages in the execroot/runfiles, and we aren't as smart as the package managers at doing this. We could hoist everything to a flat node_modules, but then programs that require multiple versions of the same library (pretty common) will stop working. Alternatively we could always nest dependencies.
  • I'm worried there may be other ways we lose compatibility. Yarn 2 is a good cautionary tale, and we have no resources in rules_nodejs to try to chase a long-tail compatibility breakage. For example package A could have a postinstall that monkey-patches package B, which won't work if we install them individually.

Upsides:

  • obvious how to depend on a mix of packages from different install targets
  • only actually need to fetch the dependencies for this build. Right now you have to npm install or yarn install the whole monorepo's package.json even if, for example, you only need typescript to type-check.

https://www.npmjs.com/package/@yarnpkg/lockfile may be useful for asking yarn for the manifest of tarballs and SHAs that Bazel should fetch
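To illustrate: a yarn.lock (v1) entry already carries the version, the resolved tarball URL, and an integrity hash, which is exactly what Bazel would need to declare one fetchable external repository per package. The hand-rolled parser below is a sketch only; real code should use @yarnpkg/lockfile, since this one handles just the simple single-spec case:

```javascript
// Extract {version, resolved, integrity} per locked package from a
// yarn.lock v1 — the same data @yarnpkg/lockfile exposes.
function manifestFromYarnLock(lockText) {
  const entries = {};
  let current = null;
  for (const line of lockText.split("\n")) {
    if (/^[^#\s]/.test(line)) {
      // Header line, e.g.: "typescript@^4.0.0":
      current = line.replace(/:\s*$/, "").replace(/"/g, "");
      entries[current] = {};
    } else if (current) {
      const m = line.match(/^\s+(version|resolved|integrity)\s+"?([^"]+)"?/);
      if (m) entries[current][m[1]] = m[2];
    }
  }
  return entries;
}

const sample = `
# yarn lockfile v1

"typescript@^4.0.0":
  version "4.0.5"
  resolved "https://registry.yarnpkg.com/typescript/-/typescript-4.0.5.tgz#ae9dddfd1069f1cb5beb3ef3b2170dd7c1332389"
  integrity sha512-ywmr/VrTVCmNTJ6iV2LwIrfG1P+lv6luD8sUJs+2eI9NLGigaN+nUQc13iHqisq7bra9lnmUSYqbJvegraBOPQ==
`;

const manifest = manifestFromYarnLock(sample);
console.log(manifest["typescript@^4.0.0"].version); // prints 4.0.5
```

From a manifest like this, the repository rule could emit one http_archive-style fetch per entry, using the integrity hash for verification and remote caching.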

@alexeagle I think this is a great idea. I can see it being a big improvement for large monorepos like ours with one package.json at the top-level. yarn install is arduous

This definitely seems more Bazel-idiomatic, and an improvement overall. If we continue to maintain the current method as well, that significantly decreases how concerned we need to be about the downsides; though if maintaining two methods became a problem, I would prefer this rules_nodemodules_external approach over the existing one.

With regard to maintainership, I wonder whether the current method should even be maintained long term, as it introduces the following issues:

  • Breaks the possibility of cross-builds
  • It locks the entire Bazel workspace while waiting for npm_install to finish, even when not a single NodeJS-related target is involved in the Bazel invocation, since analyzing the BUILD files requires resolving the @npm repo (given that we load() from it)
  • Does not allow packages to be cached by the HTTP CAS cache

Moving to a world like the one proposed in this issue will probably make supporting yarn/npm workspaces "free" and solve #399, while the current behavior negates most of the benefits of using Bazel.

The current method has to be maintained long-term unless we provide a migration path that's relatively easy and accessible to everyone.

It's wise to avoid load()ing from any external repo you don't want fetched, so reorganizing the sources a bit may let you avoid the undesirable load statement (e.g. make a "js" subdirectory and don't request targets in there)
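For example, a layout like the following (paths and labels hypothetical, though @npm//typescript:index.bzl is the kind of file npm_install generates today) confines the @npm load() to one subpackage, so builds that never request targets under js/ never trigger the npm_install fetch:

```starlark
# js/BUILD.bazel — the only package that load()s from @npm
load("@npm//typescript:index.bzl", "tsc")

# server/BUILD — no JS involvement, so `bazel build //server/...`
# never needs to resolve the @npm repository
cc_binary(
    name = "server",
    srcs = ["main.cc"],
)
```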

I decided to work on this in a fresh repo, as rules_nodejs is getting too hard to maintain. See aspect-build/rules_js