cmd/compile: use common DWARF prefixes?
josharian opened this issue · 5 comments
Run dwarfdump helloworld and look through the output. There's a lot of duplication in variable names, e.g.
0x00037bb5: TAG_variable [3]
AT_name( "runtime.statictmp_2" )
AT_location( [0x00000000010cb170] )
AT_type( {0x00000000000290d2} ( runtime.plainError ) )
AT_external( 0x01 )
0x00037bd9: TAG_variable [3]
AT_name( "runtime.statictmp_1" )
AT_location( [0x00000000010cb110] )
AT_type( {0x00000000000290d2} ( runtime.plainError ) )
AT_external( 0x01 )
0x00037bfd: TAG_variable [3]
AT_name( "runtime.statictmp_4" )
AT_location( [0x00000000010cb1e0] )
AT_type( {0x00000000000290d2} ( runtime.plainError ) )
AT_external( 0x01 )
0x00037c21: TAG_variable [3]
AT_name( "runtime.statictmp_3" )
AT_location( [0x00000000010cb1d0] )
AT_type( {0x00000000000290d2} ( runtime.plainError ) )
AT_external( 0x01 )
Note the common package prefix. Is there any way in DWARF to indicate that these are part of a shared structure called "runtime" (maybe even pretending that runtime is a C struct)? And there are bigger shared prefixes as well, like package+type for methods.
I believe that this would provide worthwhile binary size savings. Observe that of the 36 bytes that the "runtime.statictmp_4" entry consumes, 20 are from its name. This wouldn't matter as much if we fixed #11799, but it would still be worth doing.
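The arithmetic behind those numbers can be checked with a small sketch. It assumes the name is stored inline as a NUL-terminated string (the dump does not show the form, but 19 characters plus a terminator matches the 20 bytes quoted), and it takes the entry size from the difference between consecutive offsets in the dump. The helper nameBytes is a hypothetical name used only for illustration.

```go
package main

import "fmt"

// nameBytes returns the space an inline, NUL-terminated name would occupy.
// Assumption: the name is stored inline rather than as a string-table offset.
func nameBytes(name string) int {
	return len(name) + 1
}

func main() {
	// Offsets copied from the dump: the runtime.statictmp_4 entry and the
	// entry that follows it.
	const entryStart, nextEntry = 0x37bfd, 0x37c21

	total := nextEntry - entryStart
	name := nameBytes("runtime.statictmp_4")

	fmt.Printf("entry: %d bytes, name: %d bytes\n", total, name)
	fmt.Printf("name share: %.1f%%\n", 100*float64(name)/float64(total))
}
```

Under that assumption, the name accounts for more than half of the entry, and the shared "runtime." prefix alone is 8 of those bytes.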
Stuffing package variables inside a struct would definitely break Delve, but in no way that we couldn't fix. The maintainers of the gdb and lldb plugins should also be notified of this. However, IMHO this is an unnecessary stopgap solution if compression gets implemented.
Does anyone cc'd here have time to look into compression for 1.10?
I looked again yesterday but got stuck: I couldn't figure out whether it was supposed to happen at the DWARF level (some DWARF indicator that subsequent data was compressed) or at the file container level ("hey ELF, this section is compressed"). I ended up assuming the latter, and since I'm on a Mac I figured I'd start with Mach-O, but I couldn't find Mach-O support for compressed sections.
This all came up for me again because I made an otherwise worthwhile compiler change that added a few kb to helloworld's TEXT and nothing to DATA, but the DWARF additions were so big that it increased helloworld's binary size by 5%.
DWARF compression is done at the file container level. There are two ways to do it. The older way is to prefix the DWARF section name with a 'z', as in '.zdebug_info'. The newer way, which applies to sections of all types, not just DWARF sections, is to set the SHF_COMPRESSED flag in the ELF section flags.
Clearly Mach-O doesn't support the SHF_COMPRESSED flag, but I don't see any reason that it wouldn't support .zdebug_*. Looking at the gdb source code, it seems that it ought to work. I suppose the only way to find out is to try it and see what happens.
A .zdebug_* section starts with the four characters "ZLIB", followed by a uint64 of the uncompressed size in big-endian order, followed by the compressed data.
SHF_COMPRESSED works differently. The section data starts with a compression header: a 32-bit word with the compression type (1 for ZLIB); on 64-bit systems, a 32-bit padding word; a 32-bit or 64-bit word with the uncompressed section size; and a 32-bit or 64-bit word with the uncompressed alignment. The compressed data follows the header.
Thanks, Ian! That's exactly what I needed to know. I might play with this next week, if no one beats me to it.