ldc-developers/ldc

Regression from LDC 1.30 to 1.38 on referencing template instances symbols

ljmf00-wekaio opened this issue · 6 comments

For the following code:

template foo_sym(alias val)
{
    @assumeUsed @section("foo") align(1) __gshared immutable(typeof(val)) foo_sym = val;
}

template fooref_sym(alias val)
{
    @assumeUsed @section("fooref") align(8) __gshared immutable(typeof(val)*) fooref_sym = &foo_sym!val;
}

Somehow, the same template instance of fooref_sym references two addresses for the same foo_sym.

This regression was detected at Weka in a sanity check.
Instead of foo and fooref, we have wtracer_desc and wtracer_tinfo section names.

>>> p (**(cast(TraceDescriptor**)&__start_wtracer_desc+11))
$30 = {
  type = wtracer.core.type.backtraceThrowable,
  fileName = 0x0,
  moduleName = 0x0,
  lineNumber = 0,
  functionName = 0x0,
  fmt = 0x3226d6 <wtracer.core.sections.storeStringInSection!("wtracer_str", "%s(%d): %s\n%([0x%x]%|\n%)").storeStringInSection()._storeStringInSection!()._storeStringInSection> "%s(%d): %s\n%([0x%x]%|\n%)",
  argsTypes = 0xaddbce <wtracer.core.typeinfo.structs.genStructTraceInfo!(wtracer.core.typeinfo.TracerPackedArgsStruct!([], char[64], ulong, char[64], const(void)*[32]).TracerPackedArgsStruct, true, false).genStructTraceInfo>
}
>>> p (**(cast(TraceDescriptor**)&__start_wtracer_desc+23))
$31 = {
  type = wtracer.core.type.backtraceThrowable,
  fileName = 0x0,
  moduleName = 0x0,
  lineNumber = 0,
  functionName = 0x0,
  fmt = 0x3226d6 <wtracer.core.sections.storeStringInSection!("wtracer_str", "%s(%d): %s\n%([0x%x]%|\n%)").storeStringInSection()._storeStringInSection!()._storeStringInSection> "%s(%d): %s\n%([0x%x]%|\n%)",
  argsTypes = 0xaddbce <wtracer.core.typeinfo.structs.genStructTraceInfo!(wtracer.core.typeinfo.TracerPackedArgsStruct!([], char[64], ulong, char[64], const(void)*[32]).TracerPackedArgsStruct, true, false).genStructTraceInfo>
}

These two entries of fooref (wtracer_desc) are exactly the same, but compiler seem to generate two different symbols, which, the first references an actual symbol (index 11) but the second doesn't (index 23). See:

>>> p ((cast(TraceDescriptor**)&__start_wtracer_desc+11))
$34 = (wtracer.core.descriptor.TraceDescriptor **) 0xae4d70 <wtracer.writer.common.log.LOG_DESCRIPTOR_PTR!(8, [], null, null, 0u, "%s(%d): %s\n%([0x%x]%|\n%)", char[64], ulong, char[64], const(void)*[32]).LOG_DESCRIPTOR_PTR>
>>> p ((cast(TraceDescriptor**)&__start_wtracer_desc+23))
$35 = (wtracer.core.descriptor.TraceDescriptor **) 0xae4dd0

LOG_DESCRIPTOR_PTR is analogous to fooref_sym.
I expect the address to be the same, if the template instance is the same.
The thing is that this seem to only happens with certain types in the template. I didn't figure out which do it.

CC @JohanEngelen

This seems a weka compiler specific issue. New versions of the compiler doesn't generate constants in comdat.

I believe this is what weka ldc reverted backthen https://github.com/ldc-developers/ldc/pull/3424/files#diff-000f1d90d6a1b2686a37dfa10b8660f9882420ca96695e5222fe333d3bae9c86L229-L239 . So, therefore, I believe this also regress LDC but on old versions. CC @kinke what is the idea to not compile with comdat anymore? It seems weak_odr/linkonce_odr doesn't fulfil the same requirements.


old LDC:

section "wtracer_tinfo", comdat, align 1, !dbg !993 ; [#uses = 5]

new LDC:

section "wtracer_tinfo", align 1, !dbg !993 ; [#uses = 5]

Also, another question is: shouldn't weak_odr/linkonce_odr already ensure no deduplication? Is this ultimately an LLVM bug? CC @MaskRay given you are an LLVM maintainer and my understanding of your blog post here https://maskray.me/blog/2021-07-25-comdat-and-section-group , I can understand that comdat is simply more powerful, but Clang does indeed add comdat to global constants too.
Is this something LDC perhaps should stick with (do the same as Clang), and what is the reasoning for weak_odr/linkonce_odr not suffice?

what is the idea to not compile with comdat anymore?

There's fortunately (hopefully) enough context in the linked PR's comments.

Also, please specify which linker and version you are using, and whether you get different results with different linkers.

what is the idea to not compile with comdat anymore?

There's fortunately (hopefully) enough context in the linked PR's comments.

clang emits COMDATs for templates and inline functions only (with linkonce_odr), not for regular functions, for both MSVC and Linux targets. I have no idea whether ELF COMDATs provide any benefit and so whether I should try harder and only get rid of them for regular functions, to make @weak work properly.
[...]
You could ask on the forums about the COMDATs on Linux, but I doubt you'll get useful feedback.
If Clang does not emit it, then I think it is fairly safe to omit them.

This is not entirely correct. See https://godbolt.org/z/hzzc1PeMW . It generates the symbol with comdat rules, even though the target is linux.


On the comments @JohanEngelen explains exactly what is going on here in #3589 , so I'll mark this issue as duplicate. Can the mentioned issue be marked as a regression and lets discuss there.