andimarafioti/florence2-finetuning

libcudnn_cnn_train.so.8 issue

sylvain471 opened this issue · 2 comments

Hello,

Florence2 sounds very promising and GPU-poor friendly compared to current VLMs.

I'd love to get the fine-tuning script to work, but once I finally manage to get all the packages installed, I keep getting a complaint about libcudnn_cnn:

Could not load library libcudnn_cnn_train.so.8. Error: /usr/local/cuda-12.1/lib/libcudnn_cnn_train.so.8: undefined symbol: _ZN5cudnn3cnn34layerNormFwd_execute_internal_implERKNS_7backend11VariantPackEP11CUstream_stRNS0_18LayerNormFwdParamsERKNS1_20NormForwardOperationEmb, version libcudnn_cnn_infer.so.8

I tested with a standard venv and with uv. In each case libcudnn_cnn is present in the environment:

$ find | grep libcudnn_cnn
./.venv/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_cnn_train.so.8
./.venv/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_cnn_infer.so.8

Both configurations gave me the same error 😢

Any idea what might solve this problem?

Well, after digging through some obscure GitHub issues (pytorch/pytorch#119989), it turns out that running the command

unset LD_LIBRARY_PATH

before python train.py solves the problem! At least for now...
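
For anyone who prefers not to change their shell session, here is a rough Python sketch of the same idea. It rests on an assumption: the error above points at /usr/local/cuda-12.1/lib rather than the copy inside .venv, which suggests that LD_LIBRARY_PATH makes the loader pick up a system cuDNN first. The helper names (cudnn_dirs_on_ld_path, env_without_ld_library_path, run_without_ld_library_path) are made up for illustration and are not part of this repo or of PyTorch.

```python
import glob
import os
import subprocess
import sys


def cudnn_dirs_on_ld_path(env=None):
    """Return the LD_LIBRARY_PATH directories that contain a libcudnn shared object."""
    env = os.environ if env is None else env
    hits = []
    for directory in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        pattern = os.path.join(glob.escape(directory), "libcudnn*.so*")
        if glob.glob(pattern):
            hits.append(directory)
    return hits


def env_without_ld_library_path(env=None):
    """Return a copy of the environment with LD_LIBRARY_PATH removed."""
    clean = dict(os.environ if env is None else env)
    clean.pop("LD_LIBRARY_PATH", None)
    return clean


def run_without_ld_library_path(script="train.py"):
    """Run a script in a child process, as if `unset LD_LIBRARY_PATH` had been run first."""
    for directory in cudnn_dirs_on_ld_path():
        print("cuDNN found via LD_LIBRARY_PATH:", directory)
    return subprocess.run(
        [sys.executable, script],
        env=env_without_ld_library_path(),
        check=True,
    )
```

Calling run_without_ld_library_path() from the repo directory should behave like the unset command followed by python train.py, but only for the child process. If other tools on the machine rely on LD_LIBRARY_PATH, this leaves the parent shell untouched.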

Super!