Compiled PyFlex does not work on Ubuntu 20
Skylion007 opened this issue · 8 comments
I have been trying to compile the latest version of SoftGym on an Ubuntu 20 machine. However, I have been unable to load the compiled pyflex.so from either the system or a conda interpreter. The error I keep getting is that the symbol `__powf_finite` is not defined, which seems to be related to the libc version.
I have been using CUDA11.6, PyBind2.9.1 and Ubuntu 20. I have tested this issue on Python 3.9, 3.8, and 3.7 and it has caused the same issue on each. I tried compiling with clang, but I got several errors that prevented compilation altogether.
Can you copy and paste your full error message so that we can better diagnose?
Also, please list the exact steps needed to reproduce the problem.
When trying to import pyflex:
ImportError: .....pyflex.so: undefined symbol: __powf_finite
@DanielTakeshi Any updates?
Hi, I encountered the same issue today with PyFleX and figured out that the precompiled static library `NvFlexExtReleaseCUDA` uses the `__powf_finite` function, which is no longer provided by the latest glibc (see google/filament#2146 (comment)).
$ strings ../../lib/linux64/NvFlexExtReleaseCUDA_x64.a | grep finite
__powf_finite
Unfortunately, we cannot easily re-compile NVIDIA FleX (proprietary software).
I just tried the following workaround and it worked locally (outside docker).
- create `libc_compat.c` that only contains the following lines
#include <math.h>
float __powf_finite(float x, float y) { return powf(x, y); }
- and then link it into the binary in `CMakeLists.txt`
add_library(libc_compat ${ROOT}/bindings/libc_compat/libc_compat.c)
...
target_link_libraries(${EXAMPLE_BIN} PRIVATE ${ROOT}/lib/linux64/NvFlexExtReleaseCUDA_x64.a)
target_link_libraries(${EXAMPLE_BIN} PRIVATE libc_compat)
$ cmake -H. -Bbuild
$ make -j -C build
That is, I defined `__powf_finite` myself and linked it in so that `NvFlexExtReleaseCUDA` can resolve it.
It should work. I hope this helps.
info
- g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
- conda 4.5.11 Python3.7.0
- Ubuntu20.04
@denkiwakame It would be really useful if we could detect this error by checking the libc version and automatically apply this fix. Would you be willing to look into opening a PR?
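One possible shape for such a check, as a rough sketch rather than a tested fix: compare the running glibc version against a threshold. The helper names `needs_finite_shim` and `running_glibc_needs_shim` are hypothetical, and the 2.31 cutoff is an assumption based on Ubuntu 20.04 shipping glibc 2.31; it should be verified before being relied on.

```c
#include <stdio.h>
#include <gnu/libc-version.h> /* glibc-specific: gnu_get_libc_version() */

/* Assumed threshold: first glibc release where __powf_finite can no
 * longer be resolved by newly built binaries. */
#define SHIM_MAJOR 2
#define SHIM_MINOR 31

/* Returns 1 if the shim is probably needed, 0 if not,
 * and -1 if the version string cannot be parsed. */
int needs_finite_shim(const char *version)
{
    int major = 0, minor = 0;
    if (version == NULL || sscanf(version, "%d.%d", &major, &minor) != 2)
        return -1;
    if (major != SHIM_MAJOR)
        return major > SHIM_MAJOR;
    return minor >= SHIM_MINOR;
}

/* Convenience wrapper for the glibc this process is running against. */
int running_glibc_needs_shim(void)
{
    return needs_finite_shim(gnu_get_libc_version());
}
```

A build script could compile and run a tiny program around `running_glibc_needs_shim()` and add the `libc_compat` target only when it returns 1. Checking for the symbol directly at configure time would likely be more robust than comparing version numbers.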
I don't mind creating a PR, though in my humble opinion this is not a "fix" but a "temporary workaround".
- 1️⃣ I created a dummy library compatible with the old libc for `NvFlexExtReleaseCUDA`, and the function just falls back to `powf` instead of the original `__powf_finite`.
- 2️⃣ In my understanding, we can "fix" the issue only if we
  - re-compile NVIDIA FleX without `-ffast-math` (which may cause a performance issue) https://bugzilla.redhat.com/show_bug.cgi?id=1803203
  - or, re-compile NVIDIA FleX with the latest libc
  - ..., neither of which is possible for us since NVIDIA open-sources only their demo code https://github.com/NVIDIAGameWorks/FleX
  - The problem is due to neither SoftGym nor PyFleX, but to the precompiled NVIDIA FleX, which depends on the older libc and CUDA 9.
- 3️⃣ It seems that the original authors only support Ubuntu 16.04 or 18.04 (in docker). We should not extend the supported platforms unless the maintainers are eager to do so, since that would put a bit too much on their plate.
  - (side note) As far as I tested locally, we don't even need cuda-docker environments when compiling (all we need is `libcudart9.1.a`, statically linked into the python binding alongside `NvFleX`).
Btw, have you resolved the problem? Although I applied a simple workaround, it would also be appreciated if you find out a better solution for this :D