edf-hpc/verrou

Incompatibility between Valgrind 3.13 and recent binutils

HadrienG2 opened this issue · 4 comments

I tried verrou-fying a Rust program of mine, but got all kinds of weird symptoms. Making verrou generate an exclude list made me discover a problem which I encountered before:

__log10_finite  /lib64/libm-2.27.so
__ieee754_exp_fma       /lib64/libm-2.27.so
__ieee754_log_fma       /lib64/libm-2.27.so

Clearly, not every function in the program is present. I suspect that is because Rust does not follow the usual structure of a C/C++ program (my "main" function is actually called _ZN13trois_photons4main28_$u7b$$u7b$closure$u7d$$u7d$17ha2ce800db5acffacE ). In the past, I could hack around this with a suitable gen-above parameter, but since 4991694 gen-above is gone as it was assumed that recent changes made it obsolete.

Can you help me figure out what's wrong or, if all else fails, bring gen-above back?

Now the --gen-exclude option (and so --gen-source) takes into account only symbols (lines) which contain instrumented floating-point instructions. For us, --gen-above was only useful to skip the symbols used during initialization, which usually do not contain floating-point instructions. So I think --gen-above is no longer useful, and its default value was a source of problems.
So:

  • If there are symbols containing floating-point operations which are not generated, there is a bug, and I would appreciate a test case to reproduce it.
  • If you have a good reason to use --gen-above, I am curious (and I will bring back --gen-above).

After further investigation, it seems I misidentified the problem. This is not the "different main function" issue which I encountered in Python before, but rather another issue which I also encountered before in C++, namely the fact that valgrind 3.13 is not compatible with the DWARF info generated by bfd in current binutils (I use 2.31).

So, the bug is in upstream valgrind, the easiest workaround currently is to link with gold instead of bfd, and as for the long-term fix, a patch has landed in valgrind's master, and valgrind 3.14 cannot come soon enough ^^'
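For a Rust crate, one way to apply the gold workaround could be a Cargo build script. This is only a sketch under assumptions: the `build.rs` file and the helper name `gold_link_directive` are hypothetical, and it assumes the linker driver that Cargo invokes (typically gcc or clang) accepts `-fuse-ld=gold` and that gold is installed.

```rust
// build.rs (hypothetical file placed at the crate root)

/// Builds the Cargo build-script directive that forwards `-fuse-ld=gold`
/// to the linker driver, so the final binary is linked with gold
/// instead of bfd. The helper name is illustrative only.
fn gold_link_directive() -> String {
    format!("cargo:rustc-link-arg={}", "-fuse-ld=gold")
}

fn main() {
    // Cargo reads directives from the build script's standard output.
    println!("{}", gold_link_directive());
}
```

Rebuilding from scratch afterwards (for instance after `cargo clean`) makes sure the binary is actually relinked.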

A good way to check if one is encountering this bug is to randomly drop a NaN in the middle of the program that is being instrumented by verrou. If the backtrace is corrupted, then it's time to switch to gold.
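A minimal sketch of that check in Rust follows. The function names (`accumulate`, `inject_nan`) and the data are made up for illustration; the only point is to produce a NaN at run time, inside an identifiable function, so that the backtrace reported for it can be inspected.

```rust
use std::hint::black_box;

/// Hypothetical stand-in for the real numerical kernel.
fn accumulate(values: &[f64]) -> f64 {
    let mut sum = 0.0;
    for &v in values {
        sum += v * v;
    }
    sum
}

/// Deliberately produces a NaN at run time. `black_box` keeps the compiler
/// from constant-folding the division, and `inline(never)` keeps this
/// function as a separate frame in the backtrace.
#[inline(never)]
fn inject_nan(x: f64) -> f64 {
    let zero = black_box(0.0_f64);
    x + zero / zero // 0.0 / 0.0 is NaN, and NaN propagates through the sum
}

fn main() {
    let data = [1.0, 2.0, 3.0];
    let partial = accumulate(&data);
    let poisoned = inject_nan(partial);
    println!("partial = {partial}, poisoned = {poisoned}");
}
```

If the backtrace reported for the NaN does not show `inject_nan` and `main` as expected, the debug info is probably being misread.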

Hi, have you tried using Verrou with an up-to-date Valgrind version?

This can be done by following the usual manual installation procedure, but taking the master branch for Valgrind. You'll have to use the valgrind-release branch from Verrou (possibly merging it with Verrou's current master if you also want a bleeding-edge version). Be aware that some chunks of the Verrou patch for Valgrind will fail to apply; this should not be a problem.

Valgrind 3.14 should be released tomorrow; we'll try our best to release Verrou 2.1.0 soon after (our current goal is to release it before November 11th, for the SuperComputing conference).

PS: renaming the issue so that it can be found more easily by other users facing the same problem

I can confirm that building against the final valgrind 3.14 package (not just valgrind master) resolves this issue. Eager to see the next Verrou release!