RFC: npm_fetch_packages rule
alexeagle opened this issue · 6 comments
In other modern Bazel rulesets (commonly called the "external" model, like https://github.com/dillon-giacoppo/rules_python_external), the dependency story is roughly:
- let users bring the configuration files they already use with their package manager, e.g. `requirements.txt`
- Bazel's job is to understand all the packages/versions you want, fetch the tarballs, and lay them all out in individual external repositories
- make sure the postinstall for each package gets run (maybe by running `npm install` on each package individually)
- consumers list the packages in their `deps` (possibly with some syntax sugar so they all come from one `load()`, like the `requirements()` helper in rules_python_external)

We could do this for rules_nodejs; for the sake of discussion I'll call this proposed repository rule `npm_fetch_packages()`.
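A rough sketch of what that could look like. Everything here is hypothetical and drawn only from this discussion: the rule name, its attributes, the per-package repository layout, and the use of `deps` are all placeholders, not a shipped API.

```python
# WORKSPACE (hypothetical): point the rule at the lockfile the user
# already maintains with their package manager.
npm_fetch_packages(
    name = "npm",
    yarn_lock = "//:yarn.lock",  # placeholder attribute name
)

# BUILD.bazel (hypothetical): consumers list individual packages in deps,
# each backed by its own external repository, so only what this target
# needs is fetched.
nodejs_binary(
    name = "server",
    entry_point = "main.js",
    deps = [
        "@npm_express//:pkg",    # placeholder per-package repo layout
        "@npm_left_pad//:pkg",
    ],
)
```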
For npm, this has tradeoffs.
Downsides:
- we have to lay out individual packages in the execroot/runfiles, and we aren't as smart as the package managers at doing this. We could always hoist everything to a single `node_modules`, but then programs that require multiple versions of the same library (pretty common) will stop working. We could always nest instead.
- I'm worried there may be other ways we lose compatibility. Yarn 2 is a good cautionary tale, and we have no resources in rules_nodejs to chase a long tail of compatibility breakages. For example, package A could have a postinstall that monkey-patches package B, which won't work if we install them individually.
Upsides:
- obvious how to depend on a mix of packages from different install targets
- we only need to fetch the dependencies actually used by this build. Right now you have to `npm install` or `yarn install` a whole monorepo's package.json even if you only need `typescript` to type-check, for example.
https://www.npmjs.com/package/@yarnpkg/lockfile may be useful to ask yarn for the manifest of tarballs and SHAs that Bazel should fetch.
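To illustrate the idea, here is a minimal sketch of pulling tarball URLs and integrity hashes out of a yarn.lock (v1) file. The `resolved` and `integrity` field names are real yarn.lock v1 keys, but this naive line scanner is only a stand-in for the official `@yarnpkg/lockfile` parser, and the sample lockfile fragment (including its hash value) is made up:

```python
def parse_yarn_lock(text):
    """Return a list of (url, integrity) pairs from yarn.lock v1 contents."""
    entries = []
    url = integrity = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("resolved "):
            # e.g.  resolved "https://registry.yarnpkg.com/.../left-pad-1.3.0.tgz"
            url = stripped.split(" ", 1)[1].strip('"')
        elif stripped.startswith("integrity "):
            # e.g.  integrity sha512-...
            integrity = stripped.split(" ", 1)[1]
        if url and integrity:
            entries.append((url, integrity))
            url = integrity = None
    return entries

# Minimal sample fragment; the hash value is a placeholder, not real.
SAMPLE = """\
left-pad@^1.3.0:
  version "1.3.0"
  resolved "https://registry.yarnpkg.com/left-pad/-/left-pad-1.3.0.tgz"
  integrity sha512-PLACEHOLDERHASH
"""

for url, integrity in parse_yarn_lock(SAMPLE):
    print(url, integrity)
```

Each (url, hash) pair is exactly what a repository rule would need to download and verify a tarball into its own external repository.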
@alexeagle I think this is a great idea. I can see it being a big improvement for large monorepos like ours with one package.json at the top level; `yarn install` is arduous.
This definitely seems more Bazel-idiomatic, and an overall improvement. I think if we continue to maintain the current method as well, that significantly decreases how concerned we need to be about the downsides; though if maintaining both became a problem, I would prefer this rules_nodemodules_external method over the existing one.
With regards to maintainership, I wonder if the current method should even be maintained (long term) as it introduces the following issues:
- Breaks the possibility of cross-builds
- It locks the entire Bazel workspace while waiting for npm_install to finish, even when not a single NodeJS-related target is involved in the Bazel invocation, because analyzing the BUILD files requires resolving the @npm repo (given that we load() from it)
- Does not allow packages to be cached by the HTTP CAS cache
Moving to a world like the one proposed in this issue would probably make supporting yarn/npm workspaces "free" and solve #399, while the current behavior negates most of the benefits of using Bazel.
The current method has to be maintained long-term unless we provide a migration path that's relatively easy and accessible to everyone.
It's wise to avoid load()ing from any external repo you don't want fetched, so maybe reorganizing the sources a bit lets you avoid the undesirable load statement (e.g. make a "js" subdirectory and don't request targets in there).
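A concrete sketch of that suggestion (directory layout and the loaded macro are illustrative; the generated `@npm//<pkg>:index.bzl` files exist in current rules_nodejs, but your package names will differ):

```python
# BUILD.bazel (repo root): no load() from @npm here, so analyzing
# //server/... or //docs/... never forces npm_install to run.

# js/BUILD.bazel: the only package that load()s from the @npm repo;
# @npm is only resolved when a //js/... target is actually requested.
load("@npm//typescript:index.bzl", "tsc")  # example generated macro
```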
I decided to work on this in a fresh repo, as rules_nodejs is getting too hard to maintain. See aspect-build/rules_js