libcudnn_cnn_train.so.8 issue
sylvain471 opened this issue · 2 comments
sylvain471 commented
Hello,
Florence2 sounds very promising and GPU-poor friendly compared to current VLMs.
I'd love to get the fine-tuning script to work, but once I finally get all the packages installed, I keep getting a complaint about libcudnn_cnn:
Could not load library libcudnn_cnn_train.so.8. Error: /usr/local/cuda-12.1/lib/libcudnn_cnn_train.so.8: undefined symbol: _ZN5cudnn3cnn34layerNormFwd_execute_internal_implERKNS_7backend11VariantPackEP11CUstream_stRNS0_18LayerNormFwdParamsERKNS1_20NormForwardOperationEmb, version libcudnn_cnn_infer.so.8
I tested with both a standard venv and uv. The library is present in the environment:
$ find | grep libcudnn_cnn
./.venv/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_cnn_train.so.8
./.venv/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_cnn_infer.so.8
Both setups gave me the same error 😢
Any idea what might solve this problem?
sylvain471 commented
Well, after digging through some obscure GitHub issues, I found pytorch/pytorch#119989. Running the command
unset LD_LIBRARY_PATH
before python train.py
solves the problem! At least for now...
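For anyone hitting the same thing: the error above names /usr/local/cuda-12.1/lib, not the venv, which suggests a system-wide cuDNN on LD_LIBRARY_PATH was being picked up ahead of the copy bundled in site-packages. Below is a minimal sketch for checking that, assuming a Linux setup like mine. The function name `cudnn_libs_on_ld_path` is made up for illustration and is not part of PyTorch or the training script.

```python
import os
from pathlib import Path


def cudnn_libs_on_ld_path(ld_library_path=None):
    """Return cuDNN shared libraries found in the LD_LIBRARY_PATH directories.

    Hypothetical helper: any hit listed here may shadow the cuDNN
    libraries installed inside the virtual environment.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in ld_library_path.split(os.pathsep):
        if not entry:
            continue  # skip empty segments such as a trailing ':'
        directory = Path(entry)
        if directory.is_dir():
            # Match files like libcudnn_cnn_train.so.8
            hits.extend(sorted(str(p) for p in directory.glob("libcudnn*.so*")))
    return hits


if __name__ == "__main__":
    for lib in cudnn_libs_on_ld_path():
        print(lib)
```

If this prints anything outside the venv, those are the likely culprits. To avoid changing the whole shell session, `env -u LD_LIBRARY_PATH python train.py` clears the variable for a single run only.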
eloise471 commented
Super!