rui314/mold

Compilation on EL-family Linux with GCC leaves `ld.mold` zombies.

v-instrumentix opened this issue · 2 comments

Please use the Dockerfile pasted below (also available here).
It installs gcc, cmake, make, and mold, defines a trivial CMake project that uses mold as the linker, and provides a shell script that runs cmake and cmake --build and then checks for leftover mold processes.

Steps to reproduce (assuming the Dockerfile is in the current directory):

  • Build and start the container:
    docker build -t t . ; docker run -it --rm t su -
  • In the container, run the wrapper script a few times:
    bash /x/b
    After each run, a new ld.mold zombie process appears (sample output is pasted below).

Tested with GCC 11, 12, and 13, using either Ninja or Make.

[root@969b9d96458c ~]# bash /x/b
-- Configuring done (0.3s)
-- Generating done (0.0s)
-- Build files have been written to: /x/build
[ 50%] Building CXX object CMakeFiles/a.dir/main.cpp.o
[100%] Linking CXX executable a
[100%] Built target a
root          98  0.0  0.0      0     0 pts/0    Z+   16:06   0:00 [ld.mold] <defunct>
root         113  0.0  0.0  16452  1096 pts/0    S+   16:06   0:00 grep mold
[root@969b9d96458c ~]# bash /x/b
-- Configuring done (0.0s)
-- Generating done (0.0s)
-- Build files have been written to: /x/build
[ 50%] Building CXX object CMakeFiles/a.dir/main.cpp.o
[100%] Linking CXX executable a
[100%] Built target a
root          98  3.0  0.0      0     0 pts/0    Z    16:06   0:00 [ld.mold] <defunct>
root         135  0.0  0.0      0     0 pts/0    Z+   16:06   0:00 [ld.mold] <defunct>
root         150  0.0  0.0  16452  1272 pts/0    S+   16:06   0:00 grep mold
[root@969b9d96458c ~]# bash /x/b
-- Configuring done (0.0s)
-- Generating done (0.0s)
-- Build files have been written to: /x/build
[ 50%] Building CXX object CMakeFiles/a.dir/main.cpp.o
[100%] Linking CXX executable a
[100%] Built target a
root          98  1.0  0.0      0     0 pts/0    Z    16:06   0:00 [ld.mold] <defunct>
root         135  1.5  0.0      0     0 pts/0    Z    16:06   0:00 [ld.mold] <defunct>
root         172  0.0  0.0      0     0 pts/0    Z+   16:06   0:00 [ld.mold] <defunct>
root         187  0.0  0.0  16452  1112 pts/0    S+   16:06   0:00 grep mold
[root@969b9d96458c ~]# logout

Dockerfile for building the MRE:

ARG SOURCE_IMAGE="almalinux:8"
FROM ${SOURCE_IMAGE}
#
# Install minimal set of utils
#
RUN dnf install -y gcc-toolset-13-gcc-c++ cmake make
RUN dnf install -y epel-release
RUN dnf install -y mold
#RUN dnf install -y --enablerepo powertools ninja-build
WORKDIR /x
#
# Minimal CMake file using MOLD
#
RUN <<EOF cat > CMakeLists.txt
cmake_minimum_required(VERSION 3.25)
project(molddefunc)
add_link_options(-fuse-ld=mold)
add_executable(a main.cpp)
EOF
#
# Script wrapping build steps
RUN <<EOF cat > b
#!/usr/bin/env bash
set -Eeou pipefail
source /opt/rh/gcc-toolset-13/enable
echo 'int main() {}' > /x/main.cpp
cmake -B/x/build -S/x
cmake --build /x/build
ps axuww | grep mold
EOF
CMD ["su", "-"]

mold spawns a child process to do the actual linking in order to hide the latency of process exit. For a process with a large memory image, such as the linker, even exit() takes a few hundred milliseconds, and we hide that delay by doing the work in a child process so the parent can return while the child is still exiting.
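The pattern described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not mold's actual code: the function name `run_with_hidden_exit`, the pipe-based notification, and the `slow_exit_seconds` sleep (a stand-in for a slow teardown) are all assumptions made for the example.

```python
import os
import time


def run_with_hidden_exit(work, slow_exit_seconds=1.0):
    """Run work() in a forked child and return once the work is done.

    Returns (child_pid, seconds_the_parent_waited). Hypothetical helper,
    not part of mold.
    """
    read_fd, write_fd = os.pipe()
    start = time.monotonic()
    pid = os.fork()
    if pid == 0:
        # Child: do the real job, then tell the parent it is finished.
        status = 1
        try:
            os.close(read_fd)
            work()
            os.write(write_fd, b"\x01")
            os.close(write_fd)
            # Stand-in for the slow teardown of a large memory image.
            time.sleep(slow_exit_seconds)
            status = 0
        finally:
            os._exit(status)
    # Parent: block only until the child reports completion.
    os.close(write_fd)
    done = os.read(read_fd, 1)
    os.close(read_fd)
    waited = time.monotonic() - start
    if done != b"\x01":
        raise RuntimeError("child exited before finishing its work")
    # No waitpid() here: a parent using this trick exits right away,
    # so the still-exiting child is left for someone else to reap.
    return pid, waited
```

The caller sees the result after roughly the time `work()` takes, not after the child's full exit. The cost of this design is that the parent never waits for the child, so reaping depends on whichever process adopts it.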

Zombie processes are processes that have already terminated but whose parents have not yet collected their exit statuses with waitpid.
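A zombie is easy to produce on purpose. The following sketch assumes Linux (it reads the process state from /proc) and uses hypothetical helper names `proc_state` and `make_and_reap_zombie`; it forks a child that exits at once, observes its state before and after the parent calls waitpid, and returns both observations.

```python
import os


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
    except FileNotFoundError:
        return None
    # Format is "pid (comm) state ..."; comm may itself contain ")".
    return stat.rsplit(")", 1)[1].split()[0]


def make_and_reap_zombie():
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child terminates immediately
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = proc_state(pid)  # terminated, exit status not yet collected
    os.waitpid(pid, 0)        # collect the status; the entry is released
    after = proc_state(pid)
    return before, after
```

On Linux, `before` should be the `Z` state that ps shows as `<defunct>`, and `after` should be `None` because the process table entry no longer exists.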

Usually, when a parent process terminates before its children, the children are adopted by process number 1 (usually /sbin/init), and init calls waitpid (or possibly wait) to reclaim process table slots occupied by these orphaned processes.
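The reaping that pid 1 performs amounts to a loop around waitpid. Below is a minimal sketch of that loop, with the function name `reap_exited_children` chosen for illustration; a real init is assumed to run something like this whenever it is notified that a child has exited, and it is not taken from any particular init implementation.

```python
import os


def reap_exited_children():
    """Collect every child that has already exited; return their PIDs."""
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call non-blocking.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

A process that runs as pid 1 without doing this, as su does in the report above, accumulates every orphan it adopts as a zombie.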

bash also reaps orphaned children when it runs as pid 1. However, su does not.

That's why zombie processes remain in your docker environment. This issue is not limited to mold; it is generally assumed in Unix that orphan processes are reclaimed by pid 1, so other programs may also leave zombies in your environment.

There are two "solutions" to the problem:

  • Ignore them. Zombie processes have already died, and most of their resources have already been reclaimed; they only occupy process table entries. Or,
  • run your container with docker run -it --rm t bash instead of docker run -it --rm t su -.

Thank you for the exemplary, beautiful answer.