vercel/turborepo

Turbo daemon creates / leaves a ton of `<defunct>` processes, accumulating enough sometimes to breach the OS-wide process limit, preventing the creation of any new processes.

NullVoxPopuli opened this issue · 8 comments

Verify canary release

  • I verified that the issue exists in the latest Turborepo canary release.

Link to code that reproduces this issue

I think: all turbo projects running turbo while in interactive-rebase.

This is a pretty bad bug, because macOS only has a limit of ~5600 processes, and once you hit that, you can't spawn terminals, open apps, create new browser tabs, or even run ps.
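For anyone wanting to check the ceiling on their own machine, here is a minimal sketch. `show_process_limits` is a hypothetical helper name; the `sysctl` keys are macOS/BSD-specific, so they are only queried on Darwin, and the actual numbers vary by machine.

```shell
# Print the process limits that apply to the current shell and machine.
show_process_limits() {
  # Per-user process limit as seen by this shell; some shells lack -u.
  echo "per-user limit (ulimit -u): $(ulimit -u 2>/dev/null || echo unknown)"
  # System-wide and per-user kernel limits (macOS only).
  if [ "$(uname -s)" = "Darwin" ]; then
    echo "kern.maxproc: $(sysctl -n kern.maxproc)"
    echo "kern.maxprocperuid: $(sysctl -n kern.maxprocperuid)"
  fi
}

show_process_limits
```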

You need to already have Activity Monitor (or similar) open so that you can kill the turbo daemon process. Otherwise you may be forced to reboot.

Which canary version will you have in your reproduction?

2.3.1-canary.0

Environment information

❯ pnpm turbo info
turbo 2.3.1-canary.0

CLI:
   Version: 2.3.1-canary.0
   Path to executable: <.pnpm>/turbo-darwin-arm64@2.3.1-canary.0/node_modules/turbo-darwin-arm64/bin/turbo
   Daemon status: Running
   Package manager: pnpm9

Platform:
   Architecture: aarch64
   Operating system: macos
   WSL: false
   Available memory (MB): 10455
   Available CPU cores: 12

Environment:
   CI: None
   Terminal (TERM): alacritty
   Terminal program (TERM_PROGRAM): unknown
   Terminal program version (TERM_PROGRAM_VERSION): unknown
   Shell (SHELL): /opt/homebrew/Cellar/bash/5.2.32/bin/bash
   stdin: false

Setup, check processes:

ps -ef | grep defunct | wc -l
# 1 or 2

Normally, a machine should be running fewer than ~1000 processes:

ps -ef | wc -l
# I usually hover around 600 to 800
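One caveat with `grep defunct` is that it can match the grep process itself, which may be why the baseline reads 1 or 2. A minimal sketch that counts by process state instead (state codes starting with `Z`); `count_zombie_states` is a hypothetical helper name, not part of turbo:

```shell
# Count lines whose process-state field starts with "Z" (zombie/defunct).
# Reads one state code per line on stdin, so it can be fed from ps.
count_zombie_states() {
  awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# All processes vs. zombies, without grep matching itself.
echo "All: $(ps -A | wc -l), Defunct: $(ps -A -o stat= | count_zombie_states)"
```

Like the `ps -ef | wc -l` count, the "All" figure includes the ps header line.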

Scenario A (inconsistent)

  • be in interactive rebase
    (I'm splitting commits into more commits)
  • have prepare or postinstall trigger turbo's build
  • run turbo again (maybe for lint, or whatever)

Scenario B (inconsistent)

  • after changing a dependency of a package

Test:

ps -ef | grep defunct | wc -l
# 807

Test after upgrading to latest canary (noting that we run build in postinstall):

❯ ps -ef | grep defunct | wc -l
#    1435

I have an ongoing monitor for this running every second in a terminal that I just leave up all the time.

❯ watch -n 1 "echo \"All: \$(ps -ef | wc -l), Defunct: \$(ps -ef | grep defunct | wc -l)\""

And with pstree we can see that these all come from turbo

# get a list of all unique parent processes for each defunct process
❯ ps -ef | grep defunct | awk '{print $3}' | sort -u

# pass each of these to pstree
while IFS= read -r pid; do
    pstree -p "$pid"
done <<< "$(ps -ef | grep defunct | awk '{print $3}' | sort -u)"

Which will print something like this:

-+= 00001 root /sbin/launchd
 \-+= 11557 $USER /opt/homebrew/opt/borders/bin/borders
   \--- 11558 $USER <defunct>
-+= 00001 root /sbin/launchd
 \-+= 43271 $USER <.pnpm>/turbo-darwin-arm64@2.2.3/node_modules/turbo-darwin-arm64/bin/turbo --skip-infer daemon
   |--- 43359 $USER <defunct>
   |--- 43361 $USER <defunct>
   # and many hundred more
   \--- 57042 $USER <defunct>
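If pstree isn't installed, roughly the same attribution can be done with ps and awk alone. This is a sketch under the assumption that `ps -A -o ppid=,stat=` is available (it is on macOS and Linux); `zombies_by_parent` is a hypothetical helper name:

```shell
# Read "PPID STAT" pairs on stdin and print "<zombie count> <parent pid>",
# largest count first.
zombies_by_parent() {
  awk '$2 ~ /^Z/ { c[$1]++ } END { for (p in c) print c[p], p }' | sort -rn
}

# Feed it from ps, then look up each parent's command line.
ps -A -o ppid=,stat= | zombies_by_parent | while read -r count ppid; do
  echo "$count defunct children under PID $ppid: $(ps -o command= -p "$ppid")"
done
```

This gives one line per parent rather than one tree per parent, which is easier to read when a single process owns hundreds of defunct children.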

Expected behavior

No defunct processes should ever exist, as the OS will not clean these up while the daemon is running.

Actual behavior

Defunct processes are left lying around.

To Reproduce

It's possible this is reproducible in these OSS repos:

I somewhat regularly have to kill the top-level turbo daemon on Linux due to CPU usage -- it's possible that has the same root cause as the behavior I'm reporting here for macOS.

In both cases, Linux (where I do most of my OSS) and Mac (where I do my closed-source employer-owned work), killing the turbo daemon processes immediately makes my machines happier: it cleans up the defunct processes (macOS) or frees up CPU cycles (Linux).
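The workaround amounts to finding the daemon process and stopping it. A rough sketch, assuming the daemon's command line looks like the `turbo --skip-infer daemon` entry in the pstree output earlier; `turbo_daemon_pids` is a hypothetical helper, and this only clears the symptom rather than fixing the cause:

```shell
# Read "PID COMMAND..." lines on stdin and print the PIDs whose executable
# is named "turbo" and whose last argument is "daemon".
turbo_daemon_pids() {
  awk '($2 == "turbo" || $2 ~ /\/turbo$/) && $NF == "daemon" { print $1 }'
}

# List candidate daemon PIDs; inspect them before killing anything.
ps -A -o pid=,command= | turbo_daemon_pids
# Then, for each PID printed: kill <pid>
```

While the machine can still spawn processes, `turbo daemon stop` should be the gentler option; the PID-based route matters mostly because it needs fewer new processes.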

Additional context

No response

We've seen this on other developer machines at my company as well.

If either of you could share daemon logs (turbo daemon status should display the logfile) that would be helpful. We should not be spawning child processes from the daemon.

Here is what I got:

❯ pnpm turbo daemon status
# ...
✓ daemon is running
log file: <repo>/.turbo/daemon/e224a4a441d772ef-turbo.log.2024-11-19
uptime: 16m 6s 566mss
pid file: /var/folders/wk/w99lck4x7_5930c7gj65r3s40000gp/T/turbod/e224a4a441d772ef/turbod.pid
socket file: /var/folders/wk/w99lck4x7_5930c7gj65r3s40000gp/T/turbod/e224a4a441d772ef/turbod.sock
ope, big file

there is a lot of text

There was a problem saving your comment. 
Your comment is too long (maximum is 65536 characters). 
Please try again.

oops 🙈

here is a file tho

output.txt

as I was poking around in here, I noticed there was a lot of activity from watchman cookies.

It seems this is happening nearly daily for me -- can't really pinpoint what is causing the defunct processes to show up. In Activity Monitor, I do occasionally see > 20 git processes spawn, and then go away -- maybe related? idk.