matheusmoreira/liblinux

System call function ABI

matheusmoreira opened this issue · 4 comments

The system_call function is declared as:

long system_call(long number, ...);

And implemented as:

long system_call(long number, long _1, long _2, long _3, long _4, long _5, long _6);

Is this correct? Are these binary interfaces compatible? How are syscall functions implemented in other libraries?

musl implementation details

  1. Seven __syscallN functions
    • One for each possible arity: N in [0, 6]
    • Avoids moving values to unused registers
    • All types are long
  2. One syscall C preprocessor macro
    • Uses auxiliary macros to count the number of arguments
    • Emits a call to the correct __syscallN function
    • Doesn't actually exist in the compiled library
    • Not usable by any language lacking the C preprocessor
  3. One __syscall function for x86_64

The 3rd function is generic and works for all combinations of arguments. This is what I tried to do in liblinux: I declared the system_call function as a variadic function and defined it to handle up to 6 arguments. I don't think the variadic function prototype is compatible with the definition, though. They're unlikely to use the same ABI. It's working now for some reason but I don't think it's gonna stay that way.

It's not clear what the correct prototype for the generic function would be. long syscall() seems to be the best approximation of its semantics, but I'm not sure if calling that function with varying numbers of arguments is defined behavior: an empty parameter list declares a function with an "unspecified but fixed number and type(s) of arguments".

Approaches 1 and 2 in combination seem to be good for a C library.
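A minimal sketch of approach 1 for x86_64 Linux, assuming GCC or Clang extended inline assembly. The names sketch_syscall0, sketch_syscall3, sketch_getpid and sketch_write are made up for illustration. Only two of the seven arities are shown; the rest would follow the same pattern, with the fourth to sixth arguments placed in r10, r8 and r9.

```c
#if defined(__x86_64__) && defined(__linux__)
#include <sys/syscall.h> /* SYS_getpid, SYS_write */

/* x86_64 Linux convention: number in rax, arguments in rdi, rsi, rdx.
   The syscall instruction overwrites rcx and r11. */
static inline long sketch_syscall0(long number)
{
    long result;
    __asm__ volatile ("syscall"
                      : "=a"(result)
                      : "a"(number)
                      : "rcx", "r11", "memory");
    return result;
}

static inline long sketch_syscall3(long number, long a1, long a2, long a3)
{
    long result;
    __asm__ volatile ("syscall"
                      : "=a"(result)
                      : "a"(number), "D"(a1), "S"(a2), "d"(a3)
                      : "rcx", "r11", "memory");
    return result;
}

/* Each call site names exactly the arity it needs,
   so no unused argument registers are loaded. */
static inline long sketch_getpid(void)
{
    return sketch_syscall0(SYS_getpid);
}

static inline long sketch_write(int fd, const void *buf, unsigned long len)
{
    return sketch_syscall3(SYS_write, fd, (long)buf, (long)len);
}
#endif
```

These return the raw kernel result, so a failure shows up as a negative error number rather than -1 plus errno.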


How the macro works

#define __SYSCALL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
#define __SYSCALL_NARGS(...) __SYSCALL_NARGS_X(__VA_ARGS__,7,6,5,4,3,2,1,0,)
#define __SYSCALL_CONCAT_X(a,b) a##b
#define __SYSCALL_CONCAT(a,b) __SYSCALL_CONCAT_X(a,b)
#define __SYSCALL_DISP(b,...) __SYSCALL_CONCAT(b,__SYSCALL_NARGS(__VA_ARGS__))(__VA_ARGS__)

#define __syscall(...) __SYSCALL_DISP(__syscall,__VA_ARGS__)

Expanding __syscall(60, 0) step by step:

  1. __syscall(60, 0)
  2. __SYSCALL_DISP(__syscall, 60, 0)
  3. __SYSCALL_CONCAT(__syscall, __SYSCALL_NARGS(60, 0))(60, 0)
  4. __SYSCALL_CONCAT(__syscall, __SYSCALL_NARGS_X(60, 0, 7, 6, 5, 4, 3, 2, 1, 0, ))(60, 0)
  5. __SYSCALL_CONCAT(__syscall, 1)(60, 0)
  6. __SYSCALL_CONCAT_X(__syscall, 1)(60, 0)
  7. __syscall1(60, 0)

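The most important macros are __SYSCALL_NARGS and __SYSCALL_NARGS_X. The former forwards all of its arguments, followed by the 8 numbers from 7 down to 0, to the latter, which expands to its 9th argument. Each argument passed in shifts the numbers one position to the right, so the number that lands in the 9th position is the argument count minus one: the number of arguments that follow the system call number. In the example, 60 and 0 push 1 into the 9th position, which selects __syscall1.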
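The counting trick can be tried in isolation. The sketch below reuses the same macro structure under hypothetical DEMO_ names (identifiers starting with two underscores are reserved for the implementation), and the demo_callN functions are stand-ins that only report their own arity instead of making a system call.

```c
/* The caller's arguments shift the literal list 7..0 to the right, so the
   ninth parameter (n) ends up holding "argument count minus one". */
#define DEMO_NARGS_X(a, b, c, d, e, f, g, h, n, ...) n
#define DEMO_NARGS(...) DEMO_NARGS_X(__VA_ARGS__, 7, 6, 5, 4, 3, 2, 1, 0, )

/* Two levels, so DEMO_NARGS(...) is fully expanded before ## pastes it. */
#define DEMO_CONCAT_X(a, b) a##b
#define DEMO_CONCAT(a, b) DEMO_CONCAT_X(a, b)

/* Pick demo_callN by argument count, then call it with the same arguments. */
#define demo_dispatch(...) \
    DEMO_CONCAT(demo_call, DEMO_NARGS(__VA_ARGS__))(__VA_ARGS__)

/* Hypothetical stand-ins for per-arity functions: each returns its arity. */
static inline long demo_call0(long number)
{
    (void)number;
    return 0;
}

static inline long demo_call1(long number, long a1)
{
    (void)number; (void)a1;
    return 1;
}

static inline long demo_call2(long number, long a1, long a2)
{
    (void)number; (void)a1; (void)a2;
    return 2;
}
```

With these definitions, demo_dispatch(60, 0) expands to demo_call1(60, 0). The trailing comma in DEMO_NARGS guarantees that the variadic part of DEMO_NARGS_X is never left without an argument, even when seven arguments are passed in.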

minibase implementation details

  • Seven syscallN functions
    • One for each possible arity: N in [0, 6]
    • Avoids moving values to unused registers
    • All types are long

Linux kernel nolibc.h

Turns out there's a nolibc.h header in the Linux kernel source. I was not aware of it. Seems to have been added recently.

  • Seven my_syscallN macros
    • One for each possible arity: N in [0, 6]
    • Avoids moving values to unused registers
    • All types are long

> I declared the system_call function as a variadic function and defined it to handle up to 6 arguments.

Just read musl's src/internal/syscall.h again. It contains a similar declaration:

typedef long syscall_arg_t;

hidden long __syscall(syscall_arg_t, ...);

This seems to be the prototype for the hidden function defined in src/internal/x86_64/syscall.s:

.global __syscall
.hidden __syscall
.type __syscall,@function
__syscall:
	movq %rdi,%rax
	movq %rsi,%rdi
	movq %rdx,%rsi
	movq %rcx,%rdx
	movq %r8,%r10
	movq %r9,%r8
	movq 8(%rsp),%r9
	syscall
	ret

This performs the same moves, only in a different order, as the code generated by the compiler for the current system_call implementation for x86_64:

mov    %rdi,%rax
mov    %r8,%r10
mov    %rsi,%rdi
mov    %r9,%r8
mov    %rdx,%rsi
mov    0x8(%rsp),%r9
mov    %rcx,%rdx
syscall
retq

I suppose there's no problem after all.
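For comparison, a fixed six-argument C function along the following lines would plausibly compile to an instruction sequence like the one above when it is not inlined. This is a sketch assuming GCC or Clang on x86_64 Linux; sketch_system_call is a hypothetical name, not the actual liblinux source.

```c
#if defined(__x86_64__) && defined(__linux__)
#include <sys/syscall.h> /* SYS_getpid */

long sketch_system_call(long number, long a1, long a2, long a3,
                        long a4, long a5, long a6)
{
    /* r10, r8 and r9 have no single-letter constraint, so the values are
       pinned to those registers with explicit register variables. */
    register long r10 __asm__("r10") = a4;
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;
    long result;

    __asm__ volatile ("syscall"
                      : "=a"(result)
                      : "a"(number), "D"(a1), "S"(a2), "d"(a3),
                        "r"(r10), "r"(r8), "r"(r9)
                      : "rcx", "r11", "memory");
    return result;
}
#endif
```

The seventh parameter is the one that arrives on the stack, which is why both listings load %r9 from 8(%rsp).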

> I don't think the variadic function prototype is compatible with the definition, though. They're unlikely to use the same ABI. It's working now for some reason but I don't think it's gonna stay that way.

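Seems I was wrong about this. This post cites the x86_64 ABI documentation, and the register allocation example on the referenced page shows that variadic calls pass integer arguments in the same registers as normal function calls: %rdi, %rsi and so on. The only addition is that the caller of a variadic function sets %al to the number of vector registers used, which a callee with fixed integer parameters simply ignores.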

> It's not clear what the correct prototype for the generic function would be.

Using a variable argument list is appropriate, even on architectures such as x86 where the arguments are passed on the stack.
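If the definition is written in C rather than assembly, one way to keep it in agreement with the variadic declaration is to make the definition variadic as well and unpack the arguments with va_arg. The sketch below is only an illustration under assumptions: sketch_variadic_call is a hypothetical name, and it forwards to the libc syscall() wrapper so that it stays architecture-neutral, which liblinux itself would not do.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <stdarg.h>      /* va_list, va_start, va_arg, va_end */
#include <sys/syscall.h> /* SYS_* numbers */
#include <unistd.h>      /* syscall(): libc wrapper, used here as the back end */

long sketch_variadic_call(long number, ...)
{
    va_list args;
    long a[6];

    va_start(args, number);
    for (int i = 0; i < 6; i++) {
        /* Assumption: reading arguments the caller did not pass is harmless
           on the target ABI. ISO C does not guarantee that. */
        a[i] = va_arg(args, long);
    }
    va_end(args);

    /* Unlike a raw system call, the libc wrapper returns -1 and sets errno
       on failure instead of returning a negative error number. */
    return syscall(number, a[0], a[1], a[2], a[3], a[4], a[5]);
}
```

Callers that want to stay strictly within ISO C can always pass all six arguments as long values, for example sketch_variadic_call(SYS_getpid, 0L, 0L, 0L, 0L, 0L, 0L).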