symbol lookup error with likwid-appDaemon.so
Hi,
I recently encountered the following error when I tried to use the timeline mode of likwid-perfctr to collect power usage data for the Rodinia GPU benchmarks on an NVIDIA GPU:
/var/tmp/likwid/bin/likwid-perfctr -G 0 -W POWER -t 100ms "./run"
--------------------------------------------------------------------------------
CPU name: Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz
CPU type: Intel Icelake SP processor
CPU clock: 2.39 GHz
--------------------------------------------------------------------------------
Old LD_PRELOAD=likwid-appDaemon.so
/bin/sh: symbol lookup error: /var/tmp/likwid/lib/likwid-appDaemon.so: undefined symbol: bfromcstr
--------------------------------------------------------------------------------
At first, I thought that the likwid-appDaemon.so failed to include something at runtime. However, when I checked the linked libraries of likwid-appDaemon.so, I received the following output:
ldd /var/tmp/likwid/lib/likwid-appDaemon.so
linux-vdso.so.1 (0x00007fff6d44c000)
liblikwid-gotcha.so.5.3 => /var/tmp/likwid/lib/liblikwid-gotcha.so.5.3 (0x00007f1a73fd6000)
libc.so.6 => /lib64/libc.so.6 (0x00007f1a739ca000)
libm.so.6 => /lib64/libm.so.6 (0x00007f1a7367f000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f1a7347b000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1a73258000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1a73dbf000)
It seems that it has located all the libraries it requires. Do you have any idea what could be causing this problem?
You are on the right track. All listed libs were found, but one lib is missing from the list: liblikwid.so. You could try setting LD_PRELOAD=liblikwid.so before starting likwid-perfctr. I will check the Makefiles to see why likwid-appDaemon.so is not linked to liblikwid.so.
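A quick way to confirm the missing dependency is to scan the ldd output for the library name. The following is a minimal sketch, not part of LIKWID: lists_dep is a hypothetical helper that reads ldd output on stdin and succeeds only if some entry starts with the given name.

```shell
# Hypothetical helper (not part of LIKWID): succeeds if the ldd output
# on stdin contains an entry whose name starts with the given string.
lists_dep() {
    # ldd prints one dependency per line with the library name in the
    # first field. index($1, name) == 1 means "first field starts with
    # name", so "liblikwid.so" does not match "liblikwid-gotcha.so.5.3".
    awk -v name="$1" '
        index($1, name) == 1 { found = 1 }
        END { exit !found }
    '
}

# Example use (path taken from the report above):
#   ldd /var/tmp/likwid/lib/likwid-appDaemon.so | lists_dep liblikwid.so \
#       || echo "liblikwid.so is not a direct dependency"
```

With the ldd output posted above, the check fails for liblikwid.so and succeeds for liblikwid-gotcha.so. Running ldd -r on the same file additionally processes relocations and should report unresolved symbols such as bfromcstr directly.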
Hi,
Thank you for your response. However, the issue still persists even when I manually set LD_PRELOAD=liblikwid.so.
/var/tmp/likwid/bin/likwid-perfctr -G 0 -W POWER -t 100ms ./run
--------------------------------------------------------------------------------
CPU name: Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz
CPU type: Intel Icelake SP processor
CPU clock: 2.39 GHz
--------------------------------------------------------------------------------
Old LD_PRELOAD=likwid-appDaemon.so:liblikwid.so
/bin/sh: symbol lookup error: /var/tmp/likwid/lib/likwid-appDaemon.so: undefined symbol: bfromcstr
--------------------------------------------------------------------------------
OK, so we have to rebuild to fix it. In src/access-daemon/Makefile, change the CPPFLAGS:
-CPPFLAGS := $(DEFINES) $(INCLUDES) -L$(PREFIX)/lib
+CPPFLAGS := $(DEFINES) $(INCLUDES) -L../..
And rebuild (make distclean && make). It should now be linked with liblikwid.so.
This should now be fixed in the master branch. It would be great if you could test it and comment on/close the issue.
Hi,
I cloned the updated repo and built Likwid again. However, the problem still persists. I'm wondering whether this issue has something to do with the build options: direct or perf-event. Since I don't have sudo rights on my testbed system, I can't work with AccessDaemon.
I will try with the other modes
I have the same issue while trying to run timeline mode on AMD GPUs. I'm on commit #69971d, which should include your previous fix for the missing liblikwid.so. My ldd output looks like this:
ldd /var/tmp/likwid-lua/lib/likwid-appDaemon.so
linux-vdso.so.1 (0x00007ffd9acdb000)
liblikwid-gotcha.so.5.3 => /var/tmp/likwid-lua/lib/liblikwid-gotcha.so.5.3 (0x00007fb34442c000)
libc.so.6 => /lib64/libc.so.6 (0x00007fb343e4e000)
libm.so.6 => /lib64/libm.so.6 (0x00007fb343acc000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fb3438c8000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb3436a8000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb344213000)
I compiled with ROCm 5.7.2 and use the accessdaemon and appdaemon. Running with a manually prefixed LD_PRELOAD did not work either.
I've identified the error. likwid-appDaemon.so currently does not contain the bstrlib functions. Because liblikwid.so does not export its symbols, you get an undefined symbol and therefore a symbol lookup error. Unfortunately, LIKWID's Makefile currently ignores a lot of errors, which is why this goes unnoticed at build time.
I already have a local fix, but I currently cannot get nvmon to build in order to test that fix. I'll hopefully be able to sort this out today.
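The diagnosis above (the symbol is referenced by likwid-appDaemon.so but not exported by liblikwid.so) can be checked from the dynamic symbol tables. The following is a minimal sketch, not part of LIKWID: symbol_state is a hypothetical helper that reads nm -D output on stdin and classifies one symbol. As a simplification, it treats only type U as undefined.

```shell
# Hypothetical helper (not part of LIKWID): classify a symbol based on
# `nm -D` output read from stdin. Prints "undefined", "defined" or "absent".
symbol_state() {
    awk -v sym="$1" '
        NF < 2 { next }                  # skip blank or malformed lines
        {
            name = $NF                   # symbol name is the last field
            type = $(NF - 1)             # type letter precedes the name
            sub(/@.*/, "", name)         # drop a symbol version suffix, if any
            if (name != sym) next
            # Simplification: only type "U" counts as undefined.
            state = (type == "U") ? "undefined" : "defined"
        }
        END { print (state == "" ? "absent" : state) }
    '
}

# Example use (the liblikwid.so path is an assumption based on the
# install prefix seen in the report):
#   nm -D /var/tmp/likwid/lib/likwid-appDaemon.so | symbol_state bfromcstr
#   nm -D /var/tmp/likwid/lib/liblikwid.so        | symbol_state bfromcstr
```

If the first command prints "undefined" and the second prints "absent", no loaded library provides bfromcstr, which would also explain why preloading liblikwid.so did not help.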
The fix is now on master, but nvmon is currently broken. So even though your problem may be solved now, you will probably run into a segmentation fault, which I'm currently trying to fix.
Okay, assuming you use CUDA 11, everything should work fine now. CUDA 12 should show a compile error because the perfworks backend still needs to be updated.