ROCm/rocBLAS

[Bug]: rocblas cannot load TensileLibrary.dat

Epliz opened this issue · 3 comments

Epliz commented

Describe the bug

Running a TensorFlow model fails with the message:

rocBLAS error: Cannot read /opt/rocm/rocblas/lib/rocblas/library/TensileLibrary.dat: No such file or directory

To Reproduce

rocBLAS from ROCm 5.4.3 (from package manager) on Ubuntu 22.04.2 LTS.
Run the code from https://www.tensorflow.org/text/tutorials/text_generation .
ROCBLAS_TENSILE_LIBPATH is NOT set, but ROCM_PATH is set to /opt/rocm.

When ROCBLAS_TENSILE_LIBPATH is set to /opt/rocm/lib/rocblas/library/, it works as expected.
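
For anyone applying this workaround from inside a Python script rather than from the shell, a minimal sketch follows. The helper name `set_tensile_libpath` is hypothetical (it is not part of rocBLAS or TensorFlow), and the sketch assumes rocBLAS reads ROCBLAS_TENSILE_LIBPATH when it initializes, so the variable has to be set before TensorFlow is imported.

```python
import os
from pathlib import Path


# Hypothetical helper for illustration; not a rocBLAS or TensorFlow API.
def set_tensile_libpath(candidates, marker="TensileLibrary.dat"):
    """Set ROCBLAS_TENSILE_LIBPATH to the first directory containing `marker`.

    Returns the chosen directory as a string, or None if no candidate matched
    (in which case the environment is left unchanged).
    """
    for directory in candidates:
        if (Path(directory) / marker).is_file():
            os.environ["ROCBLAS_TENSILE_LIBPATH"] = str(directory)
            return str(directory)
    return None


# The directory below is the one reported to work in this issue.
chosen = set_tensile_libpath(["/opt/rocm/lib/rocblas/library/"])

# Import TensorFlow only after the variable has been set (assumption: rocBLAS
# picks the variable up during its own initialization).
# import tensorflow as tf
```

If `chosen` is None, the directory from this report does not contain TensileLibrary.dat on that machine and the install layout is different.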

Expected behavior

No error: rocBLAS should find TensileLibrary.dat without ROCBLAS_TENSILE_LIBPATH being set.

Log-files

strace output attached

strace_output.txt

Environment

Hardware description
CPU AMD Ryzen 1700
GPU MI100
dpkg -s rocblas | grep Version
Version: 5.4.3.50403-121~22.04
Version: 2.46.0.50403-121~22.04

environment.txt

Additional context

Similar to #1267

Can solve the issue by setting ROCBLAS_TENSILE_LIBPATH as well.

rkamd commented

@Epliz,
Could you remove the /opt/rocm/rocblas/lib entry from LD_LIBRARY_PATH, re-run the application, and provide the strace log?
Please also provide the output of ldd <app> before and after the LD_LIBRARY_PATH change.
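
One way to run that experiment without editing shell profiles is to build a modified environment and pass it to a child process. The sketch below is only an illustration: `drop_path_entry` is a hypothetical helper, and `app.py` in the commented-out command is a placeholder for the actual application.

```python
import os


# Hypothetical helper for illustration.
def drop_path_entry(path_value, unwanted):
    """Return a colon-separated path list with `unwanted` removed.

    Trailing slashes are ignored when comparing. Empty entries are dropped
    too, which also removes any implicit "current directory" entry.
    """
    target = unwanted.rstrip("/")
    kept = [p for p in path_value.split(":") if p and p.rstrip("/") != target]
    return ":".join(kept)


# Copy the environment and strip the entry named in the comment above.
env = dict(os.environ)
env["LD_LIBRARY_PATH"] = drop_path_entry(
    env.get("LD_LIBRARY_PATH", ""), "/opt/rocm/rocblas/lib"
)

# The change only affects processes started with this environment, e.g.:
# import subprocess
# subprocess.run(
#     ["strace", "-f", "-o", "strace_output.txt", "python3", "app.py"],
#     env=env,
# )
```

Changing LD_LIBRARY_PATH inside an already running process does not affect libraries the dynamic loader has already resolved, which is why the sketch hands the modified environment to a new process.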

rkamd commented

@Epliz,
Did you get a chance to re-run the application after removing the entry from LD_LIBRARY_PATH? Did that solve the issue? If not, please attach the output of the ldd command and the strace log.

rkamd commented

@Epliz, closing this ticket on the assumption that it is resolved or no longer relevant.