NuxiNL/cloudlibc

Hello World stdout fd blackholes with some posted tutorials


Hello, I am trying to write my first CloudABI program, following along with the tutorials. I have a Clang-based CloudABI toolchain installed, along with cloudabi-run.

hello.c:

#include <stdio.h>

int main(void) {
    dprintf(1, "Hello World!\n");
}

cloudabi.yml:

%TAG ! tag:nuxi.nl,2015:cloudabi/
---
- !fd stdout

Trace:

$ x86_64-unknown-cloudabi-cc -o hello hello.c
$ cloudabi-run -e hello <cloudabi.yml
WARNING: Attempting to start executable using emulation.
Keep in mind that this emulation provides no actual sandboxing.
Though this is likely no problem for development and testing
purposes, using this emulator in production is strongly
discouraged.

So Hello World is never printed; no segmentation fault occurs, no error message about stdout access is presented. Furthermore, execution exits with a zero exit code, indicating "success".

Update

I looked at some more examples and see that file descriptor 1 appears to no longer work out of the box as stdout in CloudABI. I changed my code to:

#include <stdio.h>
#include <stdlib.h>

/* Shared body: write the greeting to whichever descriptor we were given. */
int m(int fd) {
    dprintf(fd, "Hello World!\n");
    return EXIT_SUCCESS;
}

#ifdef __CloudABI__
    #include <argdata.h>
    #include <program.h>

    void program_main(const argdata_t *ad) {
        /* The descriptor arrives through the argdata from cloudabi-run. */
        int fd;
        argdata_get_fd(ad, &fd);
#else
    #include <unistd.h>

    int main(void) {
        int fd = STDOUT_FILENO;
#endif
        exit(m(fd));
    }

And am now able to build and run my lil app! This one is a polyglot, so it compiles and runs with either plain vanilla Clang or with CloudABI. Also, CloudABI appears to work with more modern Clang/LLVM/LLD versions, including v6.0. Could we update the documentation to reflect this?

Could we update the different per-OS tutorials to fix the stdout file descriptor part?

A larger question is why stdout/stderr are guarded. I thought CloudABI was meant to protect sensitive components that can break a system. Do stdout/stderr somehow contribute to an increased attack surface?

Yes. I agree the site needs an overhaul. Especially to deal with NuxiNL/cloudabi#10. I'll try to work on this soon, also incorporating the suggestions you made here.

To respond to this question:

A larger question is why stdout/stderr are guarded. I thought CloudABI was meant to protect sensitive components that can break a system. Do stdout/stderr somehow contribute to an increased attack surface?

They are not necessarily an attack surface, but there are also some cohesion/usability/orthogonality concerns.

The main argument for omitting stdin/stdout/stderr is that they overlap with how program_main() / program_exec() / program_spawn() work. Those functions try to pack the entire configuration that needs to be passed from one process to another in a single argdata tree. If we also supported stdin/stdout/stderr in the traditional sense, the functions could be extended to pass those in separately.

int program_exec(int program, const argdata_t *ad, int stdin, int stdout, int stderr);

But this is a slippery slope: what would happen if someone comes along, saying program_exec() should also have a dedicated parameter to pass in a hypothetical 'PRNG file descriptor' to override the behaviour of cloudabi_sys_random_get()? What about passing in 'userspace clock file descriptors' to override the process's observation of time? Soon we'd have a version of program_exec() taking 10 file descriptor arguments.

int program_exec(int program, const argdata_t *ad, int stdin, int stdout, int stderr, int clock_realtime, int clock_monotonic, int prng, [...]);

What I'd rather like to see is that if such features are ever needed, they are implemented without growing the API. They can all be passed in using argdata. If needed, we can work towards standards/conventions on what the schema of the argdata may look like.
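As a rough illustration of the idea (plain C, not the argdata API), here is a minimal sketch in which every descriptor travels in one named configuration and is looked up by name. The struct named_fd type and the config_get_fd() helper are hypothetical and exist only for this example. Adding a "prng" or "clock_realtime" entry then requires no change to any function signature.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a configuration tree. */
struct named_fd {
    const char *name; /* e.g. "stdout", "prng", "clock_realtime" */
    int fd;
};

/*
 * Look up a descriptor by name. Returns -1 when the launcher did not
 * provide one, so a missing stream is an explicit condition instead of
 * a write to whatever happens to hold a fixed descriptor number.
 */
int config_get_fd(const struct named_fd *cfg, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(cfg[i].name, name) == 0)
            return cfg[i].fd;
    }
    return -1;
}
```

A caller would build an array such as {{"stdout", 5}, {"prng", 9}} and ask for config_get_fd(cfg, 2, "stdout"). A real CloudABI program would walk the argdata it receives in program_main() instead.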

There's also the issue with file descriptor numbering. On UNIX systems there is already the issue that programs misbehave when launching them without file descriptors 0, 1 and 2 in place. The first files opened by the child process will end up having one of those numbers, meaning a printf() will end up writing garbage to a random file. I guess that this would become an even bigger problem for us, as CloudABI is more file descriptor centric (as opposed to path/namespace centric) than traditional UNIX.
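The numbering hazard can be reproduced on an ordinary POSIX system (this is not CloudABI code). In the sketch below, the helper fd_reused_for_stdout() is a hypothetical name chosen for this example. It forks a child that closes descriptor 1, opens an unrelated file, and then calls printf(). Because open() returns the lowest unused descriptor, the file takes slot 1 and receives the output.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: in a child process, close descriptor 1, open
 * `path`, and call printf(). Returns the descriptor number that the
 * child's open() received, or -1 on failure.
 */
int fd_reused_for_stdout(const char *path) {
    fflush(stdout); /* do not let the child inherit buffered output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Make sure slot 0 is occupied so the next free slot is 1. */
        if (fcntl(STDIN_FILENO, F_GETFD) < 0)
            open("/dev/null", O_RDONLY);
        close(STDOUT_FILENO);
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (fd < 0)
            _exit(255);
        printf("oops\n"); /* meant for the terminal, lands in `path` */
        fflush(stdout);
        _exit(fd);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return code == 255 ? -1 : code;
}
```

Running the helper against a scratch file should leave the printf() text inside that file, which is exactly the kind of silent misdirected write described above.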

Finally, omitting designated stdin/stdout/stderr file descriptors was also done to remove the mindset that only a single input/output source/sink exists. There's absolutely nothing against having multiple log streams for different subsystems in your application.