amd/ZenDNN

`libblis-mt.so.3` error while installing torch


Hi,

I get this error when installing the torch wheel provided with ZenDNN.

To check the installed version of PT:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/root/miniconda3/lib/python3.9/site-packages/torch/__init__.py", line 202, in <module>
    from torch._C import *  # noqa: F403
ImportError: libblis-mt.so.3: cannot open shared object file: No such file or directory

I am using an Ubuntu 20.04 Docker container in which I have installed Miniconda.

Background

  • Output of lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   43 bits physical, 48 bits virtual
CPU(s):                          48
On-line CPU(s) list:             0-47
Thread(s) per core:              2
Core(s) per socket:              24
Socket(s):                       1
NUMA node(s):                    4
Vendor ID:                       AuthenticAMD
CPU family:                      23
Model:                           8
Model name:                      AMD Ryzen Threadripper 2970WX 24-Core Processor
Stepping:                        2
Frequency boost:                 enabled
CPU MHz:                         2194.713
CPU max MHz:                     3000.0000
CPU min MHz:                     2200.0000
BogoMIPS:                        5987.96
Virtualization:                  AMD-V
L1d cache:                       768 KiB
L1i cache:                       1.5 MiB
L2 cache:                        12 MiB
L3 cache:                        64 MiB
NUMA node0 CPU(s):               0-5,24-29
NUMA node1 CPU(s):               12-17,36-41
NUMA node2 CPU(s):               6-11,30-35
NUMA node3 CPU(s):               18-23,42-47
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full AMD retpoline, IBPB conditional, STIBP disabled, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
  • I have extracted PT_v1.9.0_ZenDNN_v3.2_Python_v3.9.zip, aocl-blis-linux-aocc-3.1.0.tar.gz and aocc-compiler-3.2.0.tar into the /home/deps folder.
  • Exported the required environment variables
export ZENDNN_LOG_OPTS=ALL:0
export OMP_NUM_THREADS=48
export OMP_WAIT_POLICY=ACTIVE
export OMP_PROC_BIND=FALSE
export OMP_DYNAMIC=FALSE
export ZENDNN_GIT_ROOT=/home/deps/pyzendnn/PT_v1.9.0_ZenDNN_v3.2_Python_v3.9_2021-12-03T10/ZenDNN
export ZENDNN_PARENT_FOLDER=/home/deps/pyzendnn/PT_v1.9.0_ZenDNN_v3.2_Python_v3.9_2021-12-03T10
export ZENDNN_AOCC_COMP_PATH=/home/deps/aocc-compiler-3.2.0
export ZENDNN_BLIS_PATH=/home/deps/amd-blis
export ZENDNN_PRIMITIVE_CACHE_CAPACITY=1024
export GOMP_CPU_AFFINITY=0-47
  • Output of echo $LD_LIBRARY_PATH
/home/deps/pyzendnn/PT_v1.9.0_ZenDNN_v3.2_Python_v3.9_2021-12-03T10/ZenDNN/_out/lib/:/home/deps/pyzendnn/PT_v1.9.0_ZenDNN_v3.2_Python_v3.9_2021-12-03T10/ZenDNN/external/googletest/lib:/home/deps/amd-blis/lib/:/home/deps/aocc-compiler-3.2.0/lib:/home/deps/aocc-compiler-3.2.0/lib32:/home/deps/pyzendnn/PT_v1.9.0_ZenDNN_v3.2_Python_v3.9_2021-12-03T10/ZenDNN/_out/lib/:/home/deps/pyzendnn/PT_v1.9.0_ZenDNN_v3.2_Python_v3.9_2021-12-03T10/ZenDNN/external/googletest/lib:/home/deps/amd-blis/lib/:/home/deps/aocc-compiler-3.2.0/lib:/home/deps/aocc-compiler-3.2.0/lib32:

The AMD BLIS lib directory is in LD_LIBRARY_PATH, but torch doesn't seem to find the library. Any idea how to resolve this? Thanks!

Hi @gitcommitypos, ZenDNN 3.2 is compatible only with AMD-BLIS v3.0.6 (aocl-linux-aocc-3.0-6.tar.gz) and AOCC 3.0.0 (aocc-compiler-3.0.0.tar). You can find these versions in the respective archive download sections.
Please have a look at sections 6.2 and 6.3 of the ZenDNN User Guide (57300_ZenDNN_UG_Rev_3.2.pdf).

Hi @ratan-prasad, thank you for the help. I will install the appropriate versions.

Also, I think the issue was that aocl-blis-linux-aocc-3.1.0.tar.gz has two folders inside its lib directory: lp64 and ilp64. The library itself is not directly in lib/, so having only /home/deps/amd-blis/lib/ in LD_LIBRARY_PATH was not enough. I fixed the above issue by simply adding lp64 (the common default) to LD_LIBRARY_PATH, and it worked.
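
For anyone hitting the same error, here is a rough sketch of that workaround as a shell snippet. It assumes the BLIS tarball was extracted to the directory named by ZENDNN_BLIS_PATH (falling back to /home/deps/amd-blis as used in this thread) and that the library sits in lib/lp64. The helper name lib_on_path is made up for illustration. The dynamic loader only looks in the directories listed in LD_LIBRARY_PATH and does not descend into subdirectories, so the check does the same. Note that this only works around the directory layout; the supported combination for ZenDNN 3.2 is still the one named above.

```shell
# Hypothetical helper: succeed (and print the directory) if some entry of the
# colon-separated list $1 directly contains the file $2. Subdirectories are
# not searched, mirroring how the dynamic loader treats LD_LIBRARY_PATH.
lib_on_path() {
    path_list=$1
    lib=$2
    old_ifs=$IFS
    IFS=:
    for dir in $path_list; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

LIB=libblis-mt.so.3
# Assumption: layout of the extracted BLIS tarball as described in this thread.
BLIS_LIB_DIR="${ZENDNN_BLIS_PATH:-/home/deps/amd-blis}/lib/lp64"

if lib_on_path "${LD_LIBRARY_PATH:-}" "$LIB" >/dev/null; then
    echo "$LIB is already reachable via LD_LIBRARY_PATH"
elif [ -e "$BLIS_LIB_DIR/$LIB" ]; then
    # Prepend the lp64 directory; keep any existing entries after it.
    export LD_LIBRARY_PATH="$BLIS_LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    echo "Prepended $BLIS_LIB_DIR to LD_LIBRARY_PATH"
else
    echo "$LIB not found in $BLIS_LIB_DIR" >&2
fi
```

After running this in the same shell session, importing torch again should show whether the loader can now resolve the library.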