conda-forge/openblas-feedstock

OpenBLAS is suspiciously slow (wrt. BLIS/MKL on AMD)

Closed this issue · 5 comments

Issue

OpenBLAS is suspiciously slow in numpy (an order of magnitude slower than both BLIS and MKL, on an AMD 3950x!).

Steps

  • Create an MKL environment: conda create -n mkl numpy mkl
  • Create a BLIS environment: conda create -n blis numpy blis nomkl
  • Create an OpenBLAS environment: conda create -n openblas numpy openblas nomkl
  • Start a jupyter notebook/lab (in each environment, separately): $ OMP_NUM_THREADS=1 BLIS_NUM_THREADS=1 MKL_NUM_THREADS=1 jupyter lab
  • Run the following code to get timings:
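import numpy as np
# Square matrix sizes, including values just below and above powers of two
sizes = (1, 2, 3, 4, 32, 64, 127, 128, 129, 1023, 1024, 1025, 4096, 4096*2-1, 4096*2, 4096*2+1)
best_times = np.zeros(len(sizes))
for i, s in enumerate(sizes):
    # Two independent random s x s matrices (arrT is not a transpose of arr)
    arr = np.random.rand(s, s)
    arrT = np.random.rand(s, s)
    # %timeit is an IPython magic; -o returns the timing result object
    t = %timeit -o arr @ arrT
    best_times[i] = t.best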

I checked that CPU usage never exceeded 100.0 in top in all cases, throughout the full benchmark, until the very end.
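
For anyone reproducing this outside a notebook, here is a minimal sketch of the same measurement as a plain Python function. `%timeit` is replaced by the standard-library `timeit` module; the function name `best_matmul_times`, the `repeat` and `seed` parameters, and the extra `OPENBLAS_NUM_THREADS` variable are assumptions of this sketch, not part of the original report.

```python
import os

# Pin the BLAS backends to one thread before NumPy (and its BLAS) is loaded,
# mirroring the environment variables used in the steps above.
# OPENBLAS_NUM_THREADS is an addition of this sketch.
for var in ("OMP_NUM_THREADS", "BLIS_NUM_THREADS",
            "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
    os.environ.setdefault(var, "1")

import timeit

import numpy as np


def best_matmul_times(sizes, repeat=3, seed=0):
    """Return the best per-call time in seconds of a @ b for each size."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(sizes))
    for i, s in enumerate(sizes):
        a = rng.random((s, s))
        b = rng.random((s, s))
        timer = timeit.Timer(lambda: a @ b)
        # Let timeit choose a loop count so that tiny products are not
        # dominated by timer resolution; large products run once per loop.
        number, _ = timer.autorange()
        best[i] = min(timer.repeat(repeat=repeat, number=number)) / number
    return best
```

Calling `best_matmul_times(sizes)` with the `sizes` tuple from the snippet above, once in each environment, should give numbers comparable to `t.best`, although `%timeit` picks its loop and repeat counts differently.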

Result

image

Last point is around 25s in both MKL and BLIS; it is 3min30s in OpenBLAS. Last time I did something similar, OpenBLAS was on par with MKL. Again I insist: CPU usage was capped at 100% in all cases, there is no underlying multithreading here.
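
To put the last data point in perspective, the short calculation below turns the two reported timings into a slowdown factor and an approximate throughput. It assumes the usual estimate of about 2·n³ floating-point operations for an n × n matrix product and takes "around 25s" as exactly 25 seconds; the helper name `matmul_gflops` is made up for this sketch.

```python
def matmul_gflops(n, seconds):
    """Approximate GFLOP/s of an n x n by n x n product (about 2*n**3 flops)."""
    return 2 * n**3 / seconds / 1e9


n = 4096 * 2 + 1      # largest size in the benchmark
fast_s = 25.0         # "around 25s" reported for MKL and BLIS
slow_s = 3 * 60 + 30  # 3min30s reported for the OpenBLAS environment

slowdown = slow_s / fast_s
print(f"slowdown: {slowdown:.1f}x")                        # slowdown: 8.4x
print(f"fast: {matmul_gflops(n, fast_s):.1f} GFLOP/s")     # fast: 44.0 GFLOP/s
print(f"slow: {matmul_gflops(n, slow_s):.1f} GFLOP/s")     # slow: 5.2 GFLOP/s
```

Since the timings are rounded, these figures are only indicative, but they match the "order of magnitude" gap described above.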

Conda environment


Environment (conda list):
$ conda list
[...]
openblas                  0.3.17          pthreads_h4748800_0    conda-forge
[...]

Full list here:

$ conda list

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
alsa-lib                  1.2.3                h516909a_0    conda-forge
anyio                     3.3.0            py39hf3d152e_0    conda-forge
argon2-cffi               20.1.0           py39h3811e60_2    conda-forge
async_generator           1.10                       py_0    conda-forge
atk-1.0                   2.36.0               h3371d22_4    conda-forge
attrs                     21.2.0             pyhd8ed1ab_0    conda-forge
babel                     2.9.1              pyh44b312d_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
bleach                    3.3.1              pyhd8ed1ab_0    conda-forge
brotlipy                  0.7.0           py39h3811e60_1001    conda-forge
ca-certificates           2021.5.30            ha878542_0    conda-forge
cairo                     1.16.0            h6cf1ce9_1008    conda-forge
certifi                   2021.5.30        py39hf3d152e_0    conda-forge
cffi                      1.14.6           py39he32792d_0    conda-forge
chardet                   4.0.0            py39hf3d152e_1    conda-forge
charset-normalizer        2.0.0              pyhd8ed1ab_0    conda-forge
colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
cryptography              3.4.7            py39hbca0aa6_0    conda-forge
cycler                    0.10.0                     py_2    conda-forge
dbus                      1.13.6               h48d8840_2    conda-forge
debugpy                   1.4.1            py39he80948d_0    conda-forge
decorator                 5.0.9              pyhd8ed1ab_0    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
entrypoints               0.3             pyhd8ed1ab_1003    conda-forge
expat                     2.4.1                h9c3ff4c_0    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 hab24e00_0    conda-forge
fontconfig                2.13.1            hba837de_1005    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
fribidi                   1.0.10               h36c2ea0_0    conda-forge
gdk-pixbuf                2.42.6               h04a7f16_0    conda-forge
gettext                   0.19.8.1          h0b5b191_1005    conda-forge
giflib                    5.2.1                h36c2ea0_2    conda-forge
glib                      2.68.3               h9c3ff4c_0    conda-forge
glib-tools                2.68.3               h9c3ff4c_0    conda-forge
graphite2                 1.3.13            h58526e2_1001    conda-forge
graphviz                  2.48.0               h85b4f2f_0    conda-forge
gst-plugins-base          1.18.4               hf529b03_2    conda-forge
gstreamer                 1.18.4               h76c114f_2    conda-forge
gtk2                      2.24.33              h539f30e_1    conda-forge
gts                       0.7.6                h64030ff_2    conda-forge
harfbuzz                  2.8.2                h83ec7ef_0    conda-forge
icc_rt                    2020.2                intel_254    numba
icu                       68.1                 h58526e2_0    conda-forge
idna                      3.1                pyhd3deb0d_0    conda-forge
importlib-metadata        4.6.1            py39hf3d152e_0    conda-forge
ipykernel                 6.0.3            py39hef51801_0    conda-forge
ipython                   7.25.0           py39hef51801_1    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
jbig                      2.1               h7f98852_2003    conda-forge
jedi                      0.18.0           py39hf3d152e_2    conda-forge
jinja2                    3.0.1              pyhd8ed1ab_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
json5                     0.9.5              pyh9f0ad1d_0    conda-forge
jsonschema                3.2.0              pyhd8ed1ab_3    conda-forge
jupyter_client            6.1.12             pyhd8ed1ab_0    conda-forge
jupyter_core              4.7.1            py39hf3d152e_0    conda-forge
jupyter_server            1.10.1             pyhd8ed1ab_0    conda-forge
jupyterlab                3.0.16             pyhd8ed1ab_0    conda-forge
jupyterlab_pygments       0.1.2              pyh9f0ad1d_0    conda-forge
jupyterlab_server         2.6.1              pyhd8ed1ab_0    conda-forge
kiwisolver                1.3.1            py39h1a9c180_1    conda-forge
krb5                      1.19.1               hcc1bbae_0    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.36.1               hea4e1c9_1    conda-forge
lerc                      2.2.1                h9c3ff4c_0    conda-forge
libblas                   3.9.0           5_h92ddd45_netlib    conda-forge
libcblas                  3.9.0           5_h92ddd45_netlib    conda-forge
libclang                  11.1.0          default_ha53f305_1    conda-forge
libdeflate                1.7                  h7f98852_5    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libevent                  2.1.10               hcdb4288_3    conda-forge
libffi                    3.3                  h58526e2_2    conda-forge
libgcc-ng                 11.1.0               hc902ee8_2    conda-forge
libgd                     2.3.2                h78a0170_0    conda-forge
libgfortran-ng            11.1.0               h69a702a_0    conda-forge
libgfortran5              11.1.0               h6c583b3_0    conda-forge
libglib                   2.68.3               h3e27bee_0    conda-forge
libgomp                   11.1.0               hc902ee8_2    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.9.0           5_h92ddd45_netlib    conda-forge
libllvm11                 11.1.0               hf817b99_2    conda-forge
libogg                    1.3.4                h7f98852_1    conda-forge
libopenblas               0.3.17          pthreads_h8fe5266_0    conda-forge
libopus                   1.3.1                h7f98852_1    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libpq                     13.3                 hd57d9b9_0    conda-forge
librsvg                   2.50.7               hc3c00ef_0    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libstdcxx-ng              11.1.0               h56837e0_2    conda-forge
libtiff                   4.3.0                hf544144_1    conda-forge
libtool                   2.4.6             h58526e2_1007    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libvorbis                 1.3.7                h9c3ff4c_0    conda-forge
libwebp                   1.2.0                h3452ae3_0    conda-forge
libwebp-base              1.2.0                h7f98852_2    conda-forge
libxcb                    1.13              h7f98852_1003    conda-forge
libxkbcommon              1.0.3                he3ba5ed_0    conda-forge
libxml2                   2.9.12               h72842e0_0    conda-forge
llvmlite                  0.37.0rc2        py39hf484d3e_0    numba
lz4-c                     1.9.3                h9c3ff4c_0    conda-forge
markupsafe                2.0.1            py39h3811e60_0    conda-forge
matplotlib                3.4.2            py39hf3d152e_0    conda-forge
matplotlib-base           3.4.2            py39h2fa2bec_0    conda-forge
matplotlib-inline         0.1.2              pyhd8ed1ab_2    conda-forge
mistune                   0.8.4           py39h3811e60_1004    conda-forge
mysql-common              8.0.25               ha770c72_2    conda-forge
mysql-libs                8.0.25               hfa10184_2    conda-forge
nbclassic                 0.3.1              pyhd8ed1ab_1    conda-forge
nbclient                  0.5.3              pyhd8ed1ab_0    conda-forge
nbconvert                 6.1.0            py39hf3d152e_0    conda-forge
nbformat                  5.1.3              pyhd8ed1ab_0    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
nest-asyncio              1.5.1              pyhd8ed1ab_0    conda-forge
nomkl                     1.0                  h5ca1d4c_0    conda-forge
notebook                  6.4.0              pyha770c72_0    conda-forge
nspr                      4.30                 h9c3ff4c_0    conda-forge
nss                       3.67                 hb5efdd6_0    conda-forge
numba                     0.54.0rc1       np1.16py3.9hc547734_g9bed2ebb2_0    numba
numpy                     1.21.1           py39hdbf815f_0    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openblas                  0.3.17          pthreads_h4748800_0    conda-forge
openjpeg                  2.4.0                hb52868f_1    conda-forge
openssl                   1.1.1k               h7f98852_0    conda-forge
packaging                 21.0               pyhd8ed1ab_0    conda-forge
pandoc                    2.14.1               h7f98852_0    conda-forge
pandocfilters             1.4.2                      py_1    conda-forge
pango                     1.48.7               hb8ff022_0    conda-forge
parso                     0.8.2              pyhd8ed1ab_0    conda-forge
pcre                      8.45                 h9c3ff4c_0    conda-forge
pexpect                   4.8.0              pyh9f0ad1d_2    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    8.3.1            py39ha612740_0    conda-forge
pip                       21.2.1             pyhd8ed1ab_0    conda-forge
pixman                    0.40.0               h36c2ea0_0    conda-forge
prometheus_client         0.11.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.19             pyha770c72_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pygments                  2.9.0              pyhd8ed1ab_0    conda-forge
pyopenssl                 20.0.1             pyhd8ed1ab_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyqt                      5.12.3           py39hf3d152e_7    conda-forge
pyqt-impl                 5.12.3           py39h0fcd23e_7    conda-forge
pyqt5-sip                 4.19.18          py39he80948d_7    conda-forge
pyqtchart                 5.12             py39h0fcd23e_7    conda-forge
pyqtwebengine             5.12.1           py39h0fcd23e_7    conda-forge
pyrsistent                0.17.3           py39h3811e60_2    conda-forge
pysocks                   1.7.1            py39hf3d152e_3    conda-forge
python                    3.9.6           h49503c6_1_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.9                      2_cp39    conda-forge
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
pyyaml                    5.4.1            py39h3811e60_0    conda-forge
pyzmq                     22.1.0           py39h37b5a0c_0    conda-forge
qt                        5.12.9               hda022c4_4    conda-forge
readline                  8.1                  h46c0cb4_0    conda-forge
requests                  2.26.0             pyhd8ed1ab_0    conda-forge
requests-unixsocket       0.2.0                      py_0    conda-forge
roctools                  0.0.0                hf484d3e_1    numba
scipy                     1.7.0            py39hee8e79c_1    conda-forge
send2trash                1.7.1              pyhd8ed1ab_0    conda-forge
setuptools                49.6.0           py39hf3d152e_3    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
sniffio                   1.2.0            py39hf3d152e_1    conda-forge
sqlite                    3.36.0               h9cd32fc_0    conda-forge
tbb                       2021.1.1              intel_119    numba
terminado                 0.10.1           py39hf3d152e_0    conda-forge
testpath                  0.5.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.10               h21135ba_1    conda-forge
tornado                   6.1              py39h3811e60_1    conda-forge
traitlets                 5.0.5                      py_0    conda-forge
tzdata                    2021a                he74cb21_1    conda-forge
urllib3                   1.26.6             pyhd8ed1ab_0    conda-forge
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
webencodings              0.5.1                      py_1    conda-forge
websocket-client          0.57.0           py39hf3d152e_4    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.0.10               h7f98852_0    conda-forge
xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
xorg-libx11               1.7.2                h7f98852_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h7f98852_1    conda-forge
xorg-libxrender           0.9.10            h7f98852_1003    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
yaml                      0.2.5                h516909a_0    conda-forge
zeromq                    4.3.4                h9c3ff4c_0    conda-forge
zipp                      3.5.0              pyhd8ed1ab_0    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstd                      1.5.0                ha95c52a_0    conda-forge



Details about conda and system (conda info):
$ conda info
     active environment : base
    active env location : /home/user/Documents/Programming/Toolchains/miniconda3
            shell level : 1
       user config file : /home/user/.condarc
 populated config files : 
          conda version : 4.10.3
    conda-build version : not installed
         python version : 3.8.10.final.0
       virtual packages : __linux=5.13.4=0
                          __glibc=2.33=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/user/Documents/Programming/Toolchains/miniconda3  (writable)
      conda av data dir : /home/user/Documents/Programming/Toolchains/miniconda3/etc/conda
  conda av metadata url : None
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home/user/Documents/Programming/Toolchains/miniconda3/pkgs
                          /home/user/.conda/pkgs
       envs directories : /mnt/scratch/user/Programming/Conda/envs
                          /home/user/Documents/Programming/Toolchains/miniconda3/envs
                          /home/user/.conda/envs
               platform : linux-64
             user-agent : conda/4.10.3 requests/2.25.1 CPython/3.8.10 Linux/5.13.4-arch2-1 arch/ glibc/2.33
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False

> Create an MKL environment: conda create -n mkl numpy mkl
> Create a BLIS environment: conda create -n blis numpy blis nomkl
> Create an OpenBLAS environment: conda create -n openblas numpy openblas nomkl

This is not the correct way. Please see our docs on how to switch blas implementation.

What are you talking about?

The point is not how to switch implementations in the most comfortable way (feel free to use whichever method you prefer to switch).

The point is about this OpenBLAS being much slower than BLIS, which is not how things used to be.

> The point is not how to switch implementations in the most comfortable way

I didn't say whether it was comfortable or not. I said it's not correct, which means it's wrong. The conda list output you showed contains the following:

libblas                   3.9.0           5_h92ddd45_netlib    conda-forge
libcblas                  3.9.0           5_h92ddd45_netlib    conda-forge

which means that you are not using OpenBLAS but netlib's reference LAPACK, which is slow. You have both netlib and OpenBLAS installed, but numpy is using the netlib one.

Please use the recommended way to switch blas implementation and you'll be able to get an environment where numpy uses openblas.
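
The check described above, reading the build string of `libblas` in the `conda list` output, can be sketched as a small parser. This is only an illustration: the function names are made up, and the rule that the implementation name is the last underscore-separated part of the build string is an assumption based on the `5_h92ddd45_netlib` entries quoted in this thread.

```python
def blas_packages(conda_list_text):
    """Map BLAS-related package names to (version, build) from `conda list` text."""
    wanted = {"libblas", "libcblas", "liblapack",
              "libopenblas", "openblas", "blis", "mkl"}
    found = {}
    for line in conda_list_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] in wanted:
            found[fields[0]] = (fields[1], fields[2])
    return found


def libblas_implementation(conda_list_text):
    """Guess which implementation backs libblas from its build-string suffix."""
    entry = blas_packages(conda_list_text).get("libblas")
    if entry is None:
        return None
    # Assumption: the build string ends in "_<implementation>".
    return entry[1].rsplit("_", 1)[-1]


# Lines copied from the `conda list` output in the report.
sample = """\
libblas                   3.9.0           5_h92ddd45_netlib    conda-forge
libcblas                  3.9.0           5_h92ddd45_netlib    conda-forge
libopenblas               0.3.17          pthreads_h8fe5266_0    conda-forge
openblas                  0.3.17          pthreads_h4748800_0    conda-forge
"""

print(libblas_implementation(sample))  # netlib
```

For the environment in the report this prints `netlib` even though `openblas` is installed, which is the mismatch pointed out above. As far as I know, the conda-forge documentation on switching BLAS implementations handles this by pinning the `libblas` build string, for example `conda install "libblas=*=*openblas"`.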

Why can't openblas require/pull the correct libblas?

Well, at least I suppose this solves this specific bug request, though it sounds like improper libblas versions should be made to conflict with mismatching BLAS implementations.