baldurk/renderdoc

Injection failed on linux with musl-libc

Closed this issue · 1 comments

Description

RenderDoc fails to inject into and capture applications running on a musl-libc based Linux operating system.
This is not an issue with musl-libc itself but with RenderDoc relying on glibc behavior (running init calls before the program entry point).

Steps to reproduce

The execution environment matters (musl libc).

$ renderdoccmd capture ./run
Launching './run'
Failed to create & inject: RenderDoc injection failed: Couldn't connect to target program. Check that it didn't crash or exit during early initialisation, e.g. due to an incorrectly configured working directory.

Environment

  • RenderDoc version: renderdoccmd x64 v1.30 built from NO_GIT_COMMIT_HASH_DEFINED
  • Operating System: Linux 6.6.8, running with musl libc (Alpine Linux x86_64)
  • Graphics API: GL (not relevant)

Details

The issue is caused by the way RenderDoc hooks into Linux/POSIX processes.
Currently RenderDoc does some "magic" by setting a breakpoint (using ptrace) at the target's entry point, and it expects the init hooks in the target/child process to have run before the entry point is reached.
On Linux the program entry point is not the same as the main function (I don't know how this is done on Windows).
The libc is expected to go through the "C runtime" initialization before jumping into the application's main function.
In the case of musl-libc the init functions/constructors are called after the entry point and before the main function (as expected).

However, RenderDoc stops the target at the entry point and waits for the sockets to open, which never happens on musl, so it fails with the error message "injection failed: Couldn't connect to target program ...".
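To make the entry point vs. main distinction concrete, here is a small sketch (not RenderDoc code). The helper names `constructor_has_run` and `entry_point_address` are made up for illustration; `getauxval(AT_ENTRY)` is the real libc call that reports the ELF entry point of the running executable. Note that this only shows that a constructor has run by the time main is reached and that the entry point is a separate address; it does not show on which side of the entry point the constructor ran, which is the part that differs between libcs.

```c
#include <stdint.h>
#include <sys/auxv.h>

/* Set by the constructor below; stays 0 if no constructor ever ran. */
static int ctor_ran = 0;

/* Called during C runtime startup, before main() is entered. */
__attribute__((constructor)) static void mark_constructor(void)
{
        ctor_ran = 1;
}

/* Hypothetical helper: returns 1 once the constructor above has run. */
int constructor_has_run(void)
{
        return ctor_ran;
}

/* Hypothetical helper: address of the ELF entry point (usually _start). */
uintptr_t entry_point_address(void)
{
        return (uintptr_t)getauxval(AT_ENTRY);
}
```

Calling both helpers from main and printing `entry_point_address()` next to `(uintptr_t)&main` should show two different addresses, with `constructor_has_run()` already returning 1.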


I think RenderDoc simply needs to synchronize with the child/target process after the target has initialized the socket used to communicate with the RenderDoc capture.
I think this synchronization can be achieved with a simple signal (without using ptrace). You can find a very simple implementation of this below (as an example).

// target.c
#include <stdio.h>
int main(int argc, char **argv, char **envp)
{
        fprintf(stderr, "main\n");
        return 0;
}
// hook.c
#include <stdio.h>
#include <signal.h>
__attribute__((constructor)) void hook(void)
{
        fprintf(stderr, "setup done\n");
        raise(SIGSTOP);
        fprintf(stderr, "hooked & resumed\n");
}
// main.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/wait.h>
int main(int argc, char **argv, char **envp)
{
        pid_t pid;
        if (argc == 1) {
                fprintf(stderr, "usage: ./main <cmd> [arg...]\n");
                exit(1);
        }
        pid = fork();
        if (pid == -1) {
                fprintf(stderr, "fork %m\n");
                exit(1);
        }
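        if (pid == 0) { /* child */
                extern char **environ;
                setenv("LD_PRELOAD", "./libhook.so", 1);
                execve(argv[1], argv + 1, environ);
                perror("execve"); /* only reached if execve failed */
                _exit(127);
        } else {
                int stat;
                /* wait until the constructor in libhook.so raises SIGSTOP */
                if (waitpid(pid, &stat, WUNTRACED) == -1 || !WIFSTOPPED(stat)) {
                        fprintf(stderr, "child did not stop\n");
                        exit(1);
                }
                fprintf(stderr, "do the hook\n");
                kill(pid, SIGCONT); /* resume the child */
                waitpid(pid, &stat, 0); /* reap the child once it exits */
        }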
        return 0;
}
# Makefile
all: libhook.so main target
libhook.so: hook.c
        $(CC) -fPIC -shared -o $@ $^
main: main.c
        $(CC) -o $@ $^
target: target.c
        $(CC) -o $@ $^
clean:
        rm libhook.so main target
$ ./main target
setup done
do the hook
hooked & resumed
main
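For anyone who wants to try the stop/resume handshake without building the three files, the same idea can be sketched in a single function. This is only an illustration under assumptions: `stop_and_resume_handshake` is a made-up name, and the child stops itself directly instead of doing so from a preloaded constructor.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that stops itself (standing in for the hook constructor),
 * waits for that stop in the parent, resumes the child, and returns the
 * child's exit code. Returns -1 on any failure.
 */
int stop_and_resume_handshake(int child_exit_code)
{
        int status;
        pid_t pid = fork();

        if (pid == -1)
                return -1;
        if (pid == 0) {
                raise(SIGSTOP);         /* "setup done": stop until resumed */
                _exit(child_exit_code); /* runs only after SIGCONT */
        }
        /* WUNTRACED makes waitpid return when the child stops */
        if (waitpid(pid, &status, WUNTRACED) == -1)
                return -1;
        if (!WIFSTOPPED(status))
                return -1; /* child exited or died before stopping */
        /* the parent would do its own setup here, then resume the child */
        if (kill(pid, SIGCONT) == -1)
                return -1;
        if (waitpid(pid, &status, 0) == -1)
                return -1;
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Checking `WIFSTOPPED` lets the parent tell a child that reached the synchronization point apart from one that crashed or exited early, which is the case the current error message has to guess at.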

If musl differs in a way that makes it incompatible, then I would not consider musl a supported configuration. If you want to use RenderDoc, I would recommend re-linking your program against glibc, as I do not want to make extensive changes to the hooking in order to support multiple incompatible libcs.