ojkelly/yarn.build

Unable to build repo with co-dependent workspaces

Closed this issue · 5 comments

Describe the bug
When I run yarn build, I get no output and at some point Node crashes due to OOM.

╰─ time yarn build                           

<--- Last few GCs --->

[322:0x609d110]   892070 ms: Scavenge (reduce) 4090.3 (4095.9) -> 4089.3 (4096.9) MB, 7.0 / 0.0 ms  (average mu = 0.294, current mu = 0.276) allocation failure 
[322:0x609d110]   892078 ms: Scavenge (reduce) 4090.3 (4095.9) -> 4089.3 (4096.9) MB, 6.6 / 0.0 ms  (average mu = 0.294, current mu = 0.276) allocation failure 
[322:0x609d110]   892141 ms: Scavenge (reduce) 4090.3 (4095.9) -> 4089.3 (4096.9) MB, 7.1 / 0.0 ms  (average mu = 0.294, current mu = 0.276) allocation failure 


<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0xa04200 node::Abort() [/usr/bin/node]
 2: 0x94e4e9 node::FatalError(char const*, char const*) [/usr/bin/node]
 3: 0xb7860e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/bin/node]
 4: 0xb78987 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/bin/node]
 5: 0xd33215  [/usr/bin/node]
 6: 0xd33d9f  [/usr/bin/node]
 7: 0xd41e2b v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/bin/node]
 8: 0xd444e5 v8::internal::Heap::HandleGCRequest() [/usr/bin/node]
 9: 0xceab27 v8::internal::StackGuard::HandleInterrupts() [/usr/bin/node]
10: 0x1059cca v8::internal::Runtime_StackGuardWithGap(int, unsigned long*, v8::internal::Isolate*) [/usr/bin/node]
11: 0x1400039  [/usr/bin/node]
yarn build  930.00s user 22.41s system 106% cpu 14:55.61 total

To Reproduce
I'm currently assuming this is due to workspace A depending on workspace B and vice versa.

I assume this could be reproduced in the example in this repo. I haven't had time to try that yet.

Expected behavior
Even in the presence of circular dependencies, each project should still only be built once.

In my case, I don't even need projects to be built according to their topology at all. I only compile TS in my builds, and every build accesses only the current source of the other workspaces. The build artifacts are only relevant at runtime.

So it would be fine to just run a build in all "dirty" workspaces at the same time.
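To illustrate, something as simple as this would cover my case (buildAllDirty, runBuild, and dirtyWorkspaces are made-up names for this sketch, not yarn.build's actual API):

```ts
// Hypothetical sketch, not yarn.build's actual API: build every
// "dirty" workspace concurrently, ignoring topology entirely.
async function buildAllDirty(
  dirtyWorkspaces: string[],
  runBuild: (workspace: string) => Promise<void>,
): Promise<void> {
  // No ordering constraints: all dirty workspaces build at once.
  await Promise.all(dirtyWorkspaces.map((ws) => runBuild(ws)));
}
```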

Desktop (please complete the following information):

  • OS: Windows/WSL

Ah, yes, this is a situation that we should handle. There is a build log designed to track this and prevent building twice, but I suspect this is just hitting a recursive loop somewhere in the planning stage. We just need to keep a set of all packages added to the plan, and check against it when adding new deps.

Somewhere in this function https://github.com/ojkelly/yarn.build/blob/trunk/packages/plugins/plugin-build/src/commands/supervisor/index.ts#L365
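Roughly along these lines; this is a hypothetical sketch with made-up names (addToPlan, dependenciesOf), not the actual supervisor code:

```ts
// Hypothetical sketch of the fix, not the actual supervisor code:
// remember every workspace already added to the plan, and skip any
// dependency we've seen before so a cycle can't recurse forever.
const planned = new Set<string>();

function addToPlan(
  workspace: string,
  dependenciesOf: (ws: string) => string[],
): void {
  if (planned.has(workspace)) {
    return; // already planned (possibly via a cycle), stop recursing
  }
  planned.add(workspace);
  for (const dep of dependenciesOf(workspace)) {
    addToPlan(dep, dependenciesOf);
  }
}
```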

I can probably get to this in a week or so, unless you or someone else wants to have a go.

I thought about it some more and I'm no longer sure if my project setup is even "legal" or if a solution is as simple as I thought.

If I declare a loop in my dependencies, then who is to say another compilation of the dependency wouldn't affect the outcome of compiling the dependent? I feel like this would either require sophisticated checking or a lax approach that treats the situation as "I hope you know what you're doing".

I also noticed that yarn workspaces foreach --topological detects the loop and errors out.

I'll probably resolve my circular dependency, as I feel I'm opening myself up to other issues down the road.

That being said, a detection for this scenario would probably make for a better overall experience if a developer gets into this situation.

I commented on your PR, but I'll add some thoughts here too.

In general, circular deps are probably a failure case. Mathematically they're impossible to resolve, as they recurse into each other infinitely.

We definitely should be catching it, and printing a good error message.
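Something like this sketch, for instance; assertAcyclic and dependenciesOf are made-up names here, not the real implementation:

```ts
// Hypothetical sketch, not yarn.build's real implementation: a DFS
// that detects a dependency cycle and reports the offending path.
function assertAcyclic(
  roots: string[],
  dependenciesOf: (ws: string) => string[],
): void {
  const visiting = new Set<string>(); // workspaces on the current DFS path
  const done = new Set<string>(); // workspaces fully explored

  const visit = (ws: string, path: string[]): void => {
    if (done.has(ws)) return;
    if (visiting.has(ws)) {
      // A workspace already on the current path means we looped back.
      throw new Error(
        `Circular dependency detected: ${[...path, ws].join(" -> ")}`,
      );
    }
    visiting.add(ws);
    for (const dep of dependenciesOf(ws)) {
      visit(dep, [...path, ws]);
    }
    visiting.delete(ws);
    done.add(ws);
  };

  for (const root of roots) visit(root, []);
}
```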

Thinking about the goal of this project, it's in the same vein as Bazel, Pants, and Buck, where we combine soundness of dependencies (which yarn gives us for free :D) with smart building that takes that dependency tree into account.

While I also commented on the other issue, I wanted to add a thought here too. I realized late last night that the topological build is pretty core to yarn.build's behavior. I assume this is due to the bundling that yarn.build also provides, which probably implies a reliance on build output.

In my scenario, where I build purely for local debugging purposes, the build output is not relevant to the dependents, so a topological build actually slows down the process unnecessarily.

I'd understand if a full parallel build of all targets doesn't fit the design of yarn.build, but I would consider it very beneficial to be able to control this aspect of the build.

I'm not sure if this information could be deduced from the projects themselves, if this would require a switch, or if it's even feasible to support both.

I'm going to close this for now (though feel free to reopen if need be).

yarn.build expects your dependencies to form a directed acyclic graph. In general, if you have control, keeping your dependencies as a DAG is much easier in the long run.