conda-forge/intel_repack-feedstock

Pytorch <-> conda-forge MKL incompatibility in Windows

alanhdu opened this issue · 14 comments

I'm not sure exactly where to open this issue, so I'm happy to route this wherever.

Issue: On Windows, it appears that the upstream Pytorch conda package (pytorch::pytorch) is incompatible with the MKL provided by conda-forge:

$ conda install -c pytorch -c conda-forge pytorch cpuonly
$ python -c "import torch"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\tools\miniconda3\envs\scratch\lib\site-packages\torch\__init__.py", line 81, in <module>
    from torch._C import *
ImportError: DLL load failed: The operating system cannot run %1.

But if we use defaults::mkl instead, this works just fine. You can get it by forcing conda install intel-openmp, which will move

  mkl                           conda-forge::mkl-2019.5-281 --> pkgs/main::mkl-2019.4-245

With that change in place, the import succeeds.


Environment (conda list):
$ conda list
# packages in environment at C:\tools\miniconda3\envs\scratch:
#
# Name                    Version                   Build  Channel
certifi                   2019.11.28               py36_0    conda-forge
cffi                      1.13.2           py36hb32ad35_0    conda-forge
cpuonly                   1.0                           0    pytorch
libblas                   3.8.0                     8_mkl    conda-forge
libcblas                  3.8.0                     8_mkl    conda-forge
liblapack                 3.8.0                     8_mkl    conda-forge
llvm-openmp               9.0.1                         2    conda-forge
mkl                       2019.5                      281    conda-forge
ninja                     1.10.0               h1ad3211_0    conda-forge
numpy                     1.17.5           py36hc71023c_0    conda-forge
pip                       20.0.2                   py36_0    conda-forge
pycparser                 2.19                     py36_1    conda-forge
python                    3.6.7             he025d50_1006    conda-forge
pytorch                   1.4.0               py3.6_cpu_0  [cpuonly]  pytorch
setuptools                45.1.0                   py36_0    conda-forge
vc                        14.1                 h0510ff6_4  
vs2015_runtime            14.16.27012          hf0eaf9b_1  
wheel                     0.34.1                   py36_0    conda-forge
wincertstore              0.2                   py36_1003    conda-forge

Details about conda and system ( conda info ):
$ conda info
     active environment : scratch
    active env location : C:\tools\miniconda3\envs\scratch
            shell level : 3
       user config file : C:\Users\circleci\.condarc
 populated config files : C:\tools\miniconda3\.condarc
          conda version : 4.6.14
    conda-build version : not installed
         python version : 3.7.3.final.0
       base environment : C:\tools\miniconda3  (writable)
           channel URLs : https://conda.anaconda.org/conda-forge/win-64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://repo.anaconda.com/pkgs/main/win-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/free/win-64
                          https://repo.anaconda.com/pkgs/free/noarch
                          https://repo.anaconda.com/pkgs/r/win-64
                          https://repo.anaconda.com/pkgs/r/noarch
                          https://repo.anaconda.com/pkgs/msys2/win-64
                          https://repo.anaconda.com/pkgs/msys2/noarch
          package cache : C:\tools\miniconda3\pkgs
                          C:\Users\circleci\.conda\pkgs
                          C:\Users\circleci\AppData\Local\conda\conda\pkgs
       envs directories : C:\tools\miniconda3\envs
                          C:\Users\circleci\.conda\envs
                          C:\Users\circleci\AppData\Local\conda\conda\envs
               platform : win-64
             user-agent : conda/4.6.14 requests/2.21.0 CPython/3.7.3 Windows/10 Windows/10.0.17763
          administrator : True
             netrc file : None
           offline mode : False

Can you install mkl 2019.5 from defaults and check if the issue is still there? conda install "pkgs/main::mkl=2019.5"

@isuruf: Running that just gives me a PackagesNotFoundError

conda install --no-channel-priority "pkgs/main::mkl=2019.5"
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - pkgs/main::mkl=2019.5

Current channels:

  - https://conda.anaconda.org/conda-forge/win-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch

Same when trying to use defaults::mkl.

Can you try anaconda::mkl?

anaconda::mkl=2019.5 works for me! Maybe it's an llvm-openmp vs intel-openmp issue?

> conda install anaconda::mkl
The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    intel-openmp-2019.4        |              245         1.4 MB
    mkl-2019.5                 |              281       158.3 MB  anaconda
    ------------------------------------------------------------
                                           Total:       159.7 MB

The following NEW packages will be INSTALLED:

  intel-openmp       pkgs/main/win-64::intel-openmp-2019.4-245

The following packages will be SUPERSEDED by a higher-priority channel:

  mkl                                           conda-forge --> anaconda

> python -c "import torch"

Can you try using the conda-forge MKL and then manually replacing C:\tools\miniconda3\envs\scratch\bin\libiomp5.dll with the one from intel-openmp package?

Hm... I stand corrected. After mucking around some more, it looks like this actually fails with anaconda::mkl=2019.5 too -- when I tried again, the import failed.

I increasingly believe this is an intel-openmp issue: if you take a failing environment, installing intel-openmp makes the issue go away. (Previously, when I ran conda install anaconda::mkl, it also installed intel-openmp, which is why it passed.)

For example this fails:

# packages in environment at C:\Users\ndanielson\AppData\Local\Continuum\miniconda3\envs\test:
#
# Name                    Version                   Build  Channel
certifi                   2019.11.28               py36_0    conda-forge
cffi                      1.13.2           py36hb32ad35_0    conda-forge
cpuonly                   1.0                           0    pytorch
libblas                   3.8.0                     8_mkl    conda-forge
libcblas                  3.8.0                     8_mkl    conda-forge
liblapack                 3.8.0                     8_mkl    conda-forge
llvm-openmp               9.0.1                         2    conda-forge
mkl                       2019.5                      281    conda-forge
ninja                     1.10.0               h1ad3211_0    conda-forge
numpy                     1.17.5           py36hc71023c_0    conda-forge
pip                       20.0.2                   py36_0    conda-forge
pycparser                 2.19                     py36_1    conda-forge
python                    3.6.7             he025d50_1006    conda-forge
pytorch                   1.4.0               py3.6_cpu_0  [cpuonly]  pytorch
setuptools                45.1.0                   py36_0    conda-forge
vc                        14.1                 h0510ff6_4
vs2015_runtime            14.16.27012          hf0eaf9b_1
wheel                     0.34.1                   py36_0    conda-forge
wincertstore              0.2                   py36_1003    conda-forge

but this works:

# packages in environment at C:\Users\ndanielson\AppData\Local\Continuum\miniconda3\envs\test:
#
# Name                    Version                   Build  Channel
certifi                   2019.11.28               py36_0    conda-forge
cffi                      1.13.2           py36hb32ad35_0    conda-forge
cpuonly                   1.0                           0    pytorch
intel-openmp              2019.4                      245
libblas                   3.8.0                     8_mkl    conda-forge
libcblas                  3.8.0                     8_mkl    conda-forge
liblapack                 3.8.0                     8_mkl    conda-forge
llvm-openmp               9.0.1                         2    conda-forge
mkl                       2019.5                      281    conda-forge
ninja                     1.10.0               h1ad3211_0    conda-forge
numpy                     1.17.5           py36hc71023c_0    conda-forge
pip                       20.0.2                   py36_0    conda-forge
pycparser                 2.19                     py36_1    conda-forge
python                    3.6.7             he025d50_1006    conda-forge
pytorch                   1.4.0               py3.6_cpu_0  [cpuonly]  pytorch
setuptools                45.1.0                   py36_0    conda-forge
vc                        14.1                 h0510ff6_4
vs2015_runtime            14.16.27012          hf0eaf9b_1
wheel                     0.34.1                   py36_0    conda-forge
wincertstore              0.2                   py36_1003    conda-forge
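
To make the difference between the two listings easier to spot, here is a minimal Python sketch that diffs two `conda list` outputs. The function names `parse_conda_list` and `diff_environments` are made up for illustration, and the parsing assumes the column layout shown above (name, version, build, optional channel).

```python
def parse_conda_list(text):
    """Map package name -> (version, build, channel) from `conda list` output."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#'-prefixed header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Anything with fewer than three columns is not a package row
        # (for example, a wrapped environment path).
        if len(fields) < 3:
            continue
        name, version, build = fields[:3]
        # The channel column is empty for some packages in the listings above.
        channel = fields[-1] if len(fields) > 3 else ""
        packages[name] = (version, build, channel)
    return packages


def diff_environments(failing, working):
    """Return packages whose entries differ between the two listings.

    Each value is a (failing_entry, working_entry) pair; an entry is None
    when the package is absent from that environment.
    """
    a = parse_conda_list(failing)
    b = parse_conda_list(working)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

Fed the failing and working listings above, this should report only the intel-openmp row, which is consistent with the suspicion that the OpenMP runtime is what matters here.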

Thanks @alanhdu, I've moved the conda-forge MKL builds to the broken label for now and transferred the issue to the correct repository.

Thanks so much!

In terms of further debugging, which .dll should I replace? I see a ton of dlls that could possibly be right:

  • Library\bin\libiomp5md.dll
  • Library\bin\libomp.dll
  • Library\bin\libiompstubs5md.dll

(sorry if this is a dumb question -- Linux is basically the only OS I know how to operate and so I'm totally clueless about Windows stuff).

It's Library\bin\libiomp5md.dll from the one in intel-openmp. (Not a dumb question at all, Windows is hard and is sometimes alien to those of us with Linux background.)
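
For anyone else unsure which of these files an environment actually contains, a small Python sketch like the following can check. It only looks for the three DLL names mentioned in this thread under `Library\bin`; the helper name `find_openmp_dlls` is hypothetical, and the sketch does not tell you which package installed each file.

```python
from pathlib import Path

# DLL names taken from the list earlier in this thread.
OPENMP_DLL_NAMES = ("libiomp5md.dll", "libomp.dll", "libiompstubs5md.dll")


def find_openmp_dlls(env_prefix):
    """Report which OpenMP-related DLLs exist under <env_prefix>/Library/bin."""
    bin_dir = Path(env_prefix) / "Library" / "bin"
    return {name: (bin_dir / name).is_file() for name in OPENMP_DLL_NAMES}
```

Calling `find_openmp_dlls(r"C:\tools\miniconda3\envs\scratch")` on the environment from the original report would return a dict of name to True/False.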

@isuruf: Sorry for the delay! I just checked, and it seems that copying libiomp5md.dll from an environment with intel-openmp installed does make the Pytorch import work in an environment with conda-forge/label/broken::mkl=2019.5
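
In case it helps others reproduce the experiment, the copy step can be sketched in Python roughly as follows. This is a debugging aid under stated assumptions, not a recommended fix: the function name `replace_openmp_dll` and the `.bak` backup suffix are made up for illustration, and both arguments are assumed to be conda environment prefixes.

```python
import shutil
from pathlib import Path

# Location of the DLL inside a Windows conda environment, per the thread.
DLL_RELATIVE_PATH = Path("Library") / "bin" / "libiomp5md.dll"


def replace_openmp_dll(source_env, target_env):
    """Copy libiomp5md.dll from source_env into target_env.

    If target_env already has a file by that name, it is first saved
    next to it with a '.bak' suffix. Returns the backup path, or None
    if there was nothing to back up.
    """
    source = Path(source_env) / DLL_RELATIVE_PATH
    target = Path(target_env) / DLL_RELATIVE_PATH
    if not source.is_file():
        raise FileNotFoundError(source)
    backup = None
    if target.exists():
        backup = target.with_name(target.name + ".bak")
        shutil.copy2(target, backup)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return backup
```

Keeping the backup makes it easy to restore the original file and confirm that the swap is what changes the outcome of `python -c "import torch"`.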

Thanks @alanhdu. This issue does not occur on Linux or OSX. I need to look into how DLLs work. I hate Windows.

Hello @isuruf, has there been any progress made on this issue? I'm seeing the exact same on a Windows machine of mine when trying to use PyTorch from conda-forge :(

@hoechenberger, please open a new issue with more details

@isuruf Thank you, will do once I've collected more information! :)