alrevuelta/cONNXr

Problem with generated resolvers (1/2)

Closed this issue · 14 comments

There is a problem with the autogenerated resolvers (the ones that map a given operator with the function, i.e. argmax with argmax__float)

Lets use resolve_operator__onnx__argmax__12 as an example. This function returns a given function depending on the type that is used (i.e. operator__onnx__argmax__12__T_tensor_float). The problem here is that if one the functions is not implemented, the compiler can't of course find the symbol and it gives an error.

This was introduced in #22 and fixed by commenting the types that are not implemented, but should be fixed, since in most of the cases we won't implement all types (float, int,...) for a given operator.

Can this be solved with weakrefs? So if the symbol is not found it automatically fallbacks to an empty operator stub?

@nopeslide

@alrevuelta will look into it.

I tried the mechanism with a minimal example and it works:

main.c

#include <stdio.h>

extern __attribute__((weak)) char* func(void);

int main() {

    printf("func @ %p\n",func);

    if (func) {
        printf("func: %s\n", func());
    } else {
        printf("func not defined\n");
    }
    return 0;
}

func.c

char* func(void) {
    return "hello";
}
$ gcc -std=c99 -c main.c -o main.o
$ gcc -std=c99 -c func.c -o func.o
$ gcc -std=c99 main.o -o no_func
$ ./no_func 
func @ (nil)
func not defined
$ gcc -std=c99 main.o func.o -o func
$ ./func   
func @ 0x55ff463ed1a2
func: hello

the extern __attribute__((weak)) line actually produces a function pointer, which is by default NULL. when the linker finds a matching function it "updates" the pointer.

ah sorry, I misunderstood the problem

the address operators inside the return statements are wrong. the resolver should return the function pointer itself, not the address of the function pointer. will update the generator.

@nopeslide
Actually what I meant was your first reply. The third command is not working for me under macOS:

$ gcc -std=c99 main.o -o no_func

Undefined symbols for architecture x86_64:
  "_func", referenced from:
      _main in main.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

After a quick search I found this where one of the solutions suggests to add the following flags -Wl,-flat_namespace,-undefined,dynamic_lookup and that seems to work for me.

After a quick search I found this where one of the solutions suggests to add the following flags -Wl,-flat_namespace,-undefined,dynamic_lookup and that seems to work for me.

these flags are not present in my ld. are these mac specific flags?

Looks like its mac specific. Seems also that its not really a good idea.

"Weak linking with weak_import really only works well with dynamic libraries. You can get it to work with static linking (by specifying -undefined dynamic_lookup as suggested above) but this isn't such a hot idea. It means that no undefined symbols will be detected until runtime. This is something I would avoid in production code, personally."

Will look further into this.

since we are building a static binary no symbol search should be done at runtime. could you pass the static flag and see if it still does not work or if flat_namespace is still needed?

are you developing on a mac? if not we could keep these flags and just trust the linux gcc to report any linker errors

If you mean something like:

gcc -std=c99 main.o -o no_func -static

I'm geetting:

ld: library not found for -lcrt0.o
clang: error: linker command failed with exit code 1 (use -v to see invocation)

And according to gcc man:

This option will not work on Mac OS X unless all libraries (including libgcc.a) have also been compiled with -static. Since neither a static version of libSystem.dylib nor crt0.o are provided, this option is not useful to most people.

Yep, I'm developing on mac.

I looked at a few other projects with Mac ports (i.e. gcc). it seems they're all using these flags.
Could you try if the linker accepts missing hard refs with these flags by adding a function prototype and using it without defining the actual function?
If so, this would be a problem, if not we may use them.
Alternatively I could rewrite the Makefile to partially link everything operator related with these flags.
Hopefully this would allow "normal" linking of everything else

You mean these flags? -Wl,-flat_namespace,-undefined,dynamic_lookup?

yes

Solved in #34