explosion/thinc

pip install problem on Apple M1

rfugal opened this issue · 4 comments

I am trying to install thinc on the new Apple M1, and I have been able to successfully install all the prerequisites:
pip install numpy --no-binary :all: --no-use-pep517

successfully installed numpy 1.20.1

BLIS_ARCH="generic" pip install blis --no-binary blis

successfully installed blis 0.7.2

git clone https://github.com/explosion/thinc
pip install -r thinc/requirements.txt

successfully installed all requirements

It seems like I should now be able to pip install thinc, but it fails when it attempts to reinstall numpy and blis. Any ideas?

Sorry, I will close this now. Apparently just had to let pip keep running despite the errors.
CC="clang" pip install thinc

successfully installed thinc 7.4.2 despite the errors

Apparently thinc 8.0.1 fails and pip eventually succeeds on 7.4.2. So CC="clang" pip install thinc==7.4.2 works like a charm

Now I need an older release of spaCy because 3.0.3 requires thinc>=8.0.0

Any idea what the problem is?

As an initial kind of non-answer, in case you don't really want to compile anything, there are binary packages for OS X arm64 on conda-forge for all recent releases of thinc and spacy (also spacy v2 and thinc v7). You can use the experimental miniforge installer and then conda install spacy should just work (if not using miniforge, add -c conda-forge). Be aware that the packages are cross-compiled and not tested by the conda-forge build process, but I've run all the tests locally without problems.

https://github.com/conda-forge/miniforge


To answer the direct question here: you want to use pip install thinc --no-build-isolation to build thinc in your current venv where you've already gotten numpy installed instead of in an isolated build env (what pip does by default for newer releases of thinc, which is where it's reinstalling numpy from scratch).
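As a rough sketch of that idea (not something from the thinc docs): skipping build isolation only makes sense when numpy is already importable in the active environment, so you could check for it first. The helper name have_numpy and the install_cmd variable are made up for illustration, and the command is printed rather than executed so you can inspect it.

```shell
# Hypothetical helper: succeeds only if numpy can be imported by the
# python3 that is first on PATH (i.e. the active venv's interpreter).
have_numpy() {
    python3 -c 'import numpy' >/dev/null 2>&1
}

install_cmd="pip install thinc"
if have_numpy; then
    # numpy is already here, so build against it instead of letting pip
    # create an isolated build env and rebuild numpy from scratch.
    install_cmd="$install_cmd --no-build-isolation"
fi

# Print the command instead of running it.
echo "$install_cmd"
```

If the printed command does not include --no-build-isolation, numpy was not importable and it is probably worth fixing that first.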

If you want to build an editable install from the github repo you'd use:

cd thinc
pip install --no-build-isolation --editable .

The very detailed version:

Unfortunately, installing numpy or packages that require numpy with pip on an Apple M1 is a bit of a headache right now. The most recent versions of numpy (1.19.3--1.20.1) require a version of wheel in pyproject.toml (PEP 517) that doesn't work on Big Sur (you need wheel>=0.36.1), and numpy and all of our packages are configured to use build isolation by default, so that wheel version gets pulled into every isolated build.

Depending on how you've configured numpy above, you might also have built it without a blas library, which will probably be slow. Here I've installed openblas with homebrew (brew install openblas) and then the install process looks like this:

Add a constraints.txt file to specify an older version of numpy without the wheel problem in the isolated builds:

numpy==1.19.2

Then install with the constraint applied:

source .venv/bin/activate
pip install -U pip setuptools wheel
PIP_CONSTRAINT=constraints.txt OPENBLAS="$(brew --prefix openblas)" pip install spacy
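The same setup can be sketched as a small script that creates the constraints file and exports PIP_CONSTRAINT, so the pin also reaches pip's isolated build environments. Using an absolute path via $PWD is my own cautious choice here, not something the steps above require; the final pip install is left as a comment since it needs the venv and Homebrew's openblas.

```shell
# Write the constraints file in the current directory.
printf 'numpy==1.19.2\n' > constraints.txt

# Export the constraint so every pip invocation in this shell sees it,
# including the ones pip spawns for isolated builds.
# (Absolute path is an assumption made for robustness.)
export PIP_CONSTRAINT="$PWD/constraints.txt"
echo "PIP_CONSTRAINT=$PIP_CONSTRAINT"

# Then, with the venv active and openblas installed via Homebrew:
#   OPENBLAS="$(brew --prefix openblas)" pip install spacy
```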

It looks like the next version of numpy (1.21) won't have this problem.

The steps above work with python 3.9 from conda-forge or vanilla python 3.9.2 (from source or the universal2 installer from https://www.python.org/downloads/release/python-392/). python 3.9 from homebrew and the system python 3.8 do not seem to work correctly.

  • With python 3.9.2 from homebrew, pip install thinc works but then pip install spacy fails because a numpy include path isn't getting set correctly. This seems to be due to homebrew's custom distutils.cfg (or something related to a modified setuptools? I am not sure). There is a slight difference in how thinc vs. spacy set the numpy include paths, so we will modify this for the next release (v3.0.4; see explosion/spaCy#7204).

  • With the system python 3.8 in /usr/bin/python3, I can compile and install numpy as above, but numpy segfaults in the middle of its test suite, so I think vanilla python or python from conda-forge is a better choice.
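If you're not sure which of these pythons your venv was created from, something like the following can give a hint. This is only a sketch: the function name classify_python_prefix is invented, and the path patterns are assumptions about typical install locations (/opt/homebrew for homebrew on Apple Silicon, the Command Line Tools or Xcode directories for the system python), so treat the result as a guess rather than a definitive answer.

```shell
# Hypothetical classifier: maps an interpreter's base prefix to a rough
# guess at where that python came from. The patterns are assumptions.
classify_python_prefix() {
    case "$1" in
        /opt/homebrew/*) echo "homebrew" ;;
        */CommandLineTools/*|*/Xcode.app/*) echo "system" ;;
        *conda*|*miniforge*) echo "conda" ;;
        *) echo "other" ;;
    esac
}

# sys.base_prefix points at the underlying install even inside a venv.
prefix="$(python3 -c 'import sys; print(sys.base_prefix)' 2>/dev/null || true)"
echo "base prefix: ${prefix:-<none>} -> $(classify_python_prefix "$prefix")"
```

A result of "other" would cover python built from source or installed from python.org, among anything else the patterns don't recognize.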


How to install everything without build isolation:

If you're using python from conda-forge, you might want to go ahead and use their openblas+numpy, too, and then you don't need to install openblas separately. If you want to pip install from source instead of using the conda-forge packages, or you want to use numpy 1.20 with pip, you have to go step by step through the dependencies that require numpy: for each one, install its requirements first and then install the package itself with build isolation disabled.

Start with conda numpy:

conda activate spacyvenv
conda install numpy

OR pip numpy:

source .venv/bin/activate
pip install cython
OPENBLAS="$(brew --prefix openblas)" pip install numpy --no-build-isolation
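Before moving on, it may be worth checking that the numpy you ended up with actually found a blas library. A minimal sketch, assuming that numpy.show_config() mentions "openblas" somewhere in its output when numpy was built against it (the helper name check_numpy_blas is made up):

```shell
# Hypothetical check: returns 0 if numpy's build config mentions openblas,
# 1 if it doesn't or if numpy can't be imported at all.
check_numpy_blas() {
    config="$(python3 -c 'import numpy; numpy.show_config()' 2>/dev/null)" || return 1
    case "$config" in
        *openblas*) return 0 ;;
        *) return 1 ;;
    esac
}

if check_numpy_blas; then
    echo "numpy reports openblas in its build config"
else
    echo "no openblas in numpy's build config (or numpy is not importable)"
fi
```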

And then the pip install steps for spacy, disabling build isolation for everything that requires numpy:

pip install -r https://raw.githubusercontent.com/explosion/cython-blis/master/requirements.txt
pip install blis --no-build-isolation
pip install -r https://raw.githubusercontent.com/explosion/thinc/master/requirements.txt
pip install thinc --no-build-isolation
pip install -r https://raw.githubusercontent.com/explosion/spaCy/master/requirements.txt
pip install spacy --no-build-isolation
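Since the three pairs of commands follow the same pattern, they can also be written as a loop. This is just a convenience sketch: the run wrapper and the RUN variable are invented for illustration, and by default the commands are only printed (set RUN=1 to actually execute them).

```shell
# Dry run unless RUN=1 is set in the environment.
RUN="${RUN:-0}"
run() {
    if [ "$RUN" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Each entry is "<package> <github repo>", in dependency order.
for spec in "blis explosion/cython-blis" "thinc explosion/thinc" "spacy explosion/spaCy"; do
    pkg="${spec%% *}"
    repo="${spec#* }"
    # Install the build/runtime requirements first, then build the package
    # against the numpy that is already installed.
    run pip install -r "https://raw.githubusercontent.com/$repo/master/requirements.txt"
    run pip install "$pkg" --no-build-isolation
done
```

The order matters because thinc needs blis and spacy needs thinc, so each package is built only after the one it depends on.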

(This whole situation is very unpleasant and frustrating.)

I think this situation is mostly resolved with the release of numpy 1.21.0, so I'll close this issue for now. Feel free to open a new issue if similar problems come up in the future.