nextstrain/docs.nextstrain.org

Native install of nextstrain tool suite sourced from Conda is failing on Mac Pro with Apple M1 chip

alliblk opened this issue · 13 comments

Current Behavior

Hey nextstrain friends :) I got a new computer (2021 Macbook Pro with the Apple M1 chip) and wanted to set up my nextstrain environment. Following instructions on the docs page, I installed Miniconda (64 bit for Apple M1 chip pkg). Got the base environment running no problem. Given the docs' warning about docker slowness on the M1 chips, I decided to go with the native install.

Within the activated base environment, I ran:

mamba create -n nextstrain \
  -c conda-forge -c bioconda \
  nextstrain-cli augur auspice nextalign snakemake git \
  --yes

Which failed with the following error:

Looking for: ['nextstrain-cli', 'augur', 'auspice', 'nextalign', 'snakemake', 'git']
conda-forge/osx-arm64    Using cache
conda-forge/noarch       Using cache
bioconda/osx-arm64       Using cache
bioconda/noarch          Using cache
pkgs/main/osx-arm64      [====================] (00m:00s) No change
pkgs/r/osx-arm64         [====================] (00m:00s) No change
pkgs/r/noarch            [====================] (00m:00s) No change
pkgs/main/noarch         [====================] (00m:00s) No change
Encountered problems while solving:
nothing provides requested auspice
nothing provides requested nextalign
nothing provides mafft needed by augur-10.0.0-py_0

I wondered whether updating conda might fix this, so I ran conda update conda on the base environment, and repeated the above mamba install command, which failed again with the same error. Then I wondered whether it might be an issue with mamba, so I instead tried just using conda to install augur (not the whole suite), running conda install -c conda-forge -c bioconda augur. This command also failed, with the following error:

Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: / 
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed                                                                                                                      

UnsatisfiableError: 

Expected behavior

This is the first time I've had a local Nextstrain install fail. I thought I would just follow the commands in the documentation, and that I'd build a Nextstrain environment that I could just activate and use.

Possible solution

In the "olden days" there was a public facing URL for the nextstrain yaml file specifying all the requirements of the conda environment. I'm not seeing that in the updated documentation anywhere. If that is still around, any chance I could try out the old install method to see if that works? Alternatively I can totally try installing augur from source, and auspice using the old npm install instructions, but I'm imagining that if I'm having this issue, other people are possibly having issues as well, and so I thought you should know. I'd love to have it all done neatly with mamba!

Your environment: if running Nextstrain locally

  • Operating system: macOS Monterey on MacBook Pro (Apple M1)
  • Browser: Chrome
  • Version (e.g. auspice 2.7.0): can't get installed, so no version.
  • note: I didn't think that this would matter, but on the off chance that it did, I changed my terminal from zsh to bash but still continued to have the same error with installation attempts.

Additional context

Thank you for helping me out with this!

Hi @alliblk, thanks for bringing this up! I can confirm/reproduce this issue, and apologies for not catching it earlier – I was unknowingly running an Intel-based conda version on my M1 Mac!

Can you try creating an intermediate environment base_osx-64 using these commands? (from conda-forge/miniforge#165 (comment))

CONDA_SUBDIR=osx-64 conda create -n base_osx-64 python   # create a new environment called base_osx-64 with intel packages.
conda activate base_osx-64
conda config --env --set subdir osx-64  # make sure that conda commands in this environment use intel packages

Then run the mamba create command you tried earlier.
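
For anyone who wants the steps above in one place, here is a rough sketch that strings them together. By default it only prints the commands (a dry run) so you can review them first. The run helper, the DRY_RUN variable and the setup_osx64_nextstrain function name are made up for this example and are not part of conda or Nextstrain. The sketch assumes bash and that conda and mamba are on your PATH.

```shell
# Dry-run sketch of the osx-64 workaround. run() and DRY_RUN are illustrative
# helpers invented for this example; they are not part of conda or Nextstrain.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"          # default: only print the command
  else
    "$@"                 # DRY_RUN=0: actually execute it
  fi
}

setup_osx64_nextstrain() {
  if [ "${DRY_RUN:-1}" != "1" ]; then
    # `conda activate` needs conda's shell hook in a non-interactive bash script.
    eval "$(conda shell.bash hook)"
  fi
  # Intermediate environment that resolves Intel (osx-64) packages.
  run env CONDA_SUBDIR=osx-64 conda create -n base_osx-64 python
  run conda activate base_osx-64
  # Make later conda commands in this environment use osx-64 as well.
  run conda config --env --set subdir osx-64
  # The original install command from the top of this issue.
  run mamba create -n nextstrain -c conda-forge -c bioconda nextstrain-cli augur auspice nextalign snakemake git --yes
}

setup_osx64_nextstrain
```

Setting DRY_RUN=0 before calling the function would execute the commands instead of printing them.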

@victorlin Using osx-64 will mean that emulation is used and thus will be slower, right? If so, then until osx-arm64 is supported it might be easier to install via the Docker runtime instead? (Although maybe Conda without M1 support is still faster than Docker without M1 support for other reasons?)

@tsibley the benchmark I did over at nextstrain/docker-base#35 (comment) was actually with the emulation in native install, still much faster than Docker emulation. I'd vote to keep the current recommendation.

@victorlin can confirm that this worked, thank you! Was able to make the environment, and I've checked versioning on augur, auspice and nextstrain-cli (all good). Thanks for the prompt help :)

Just wondering, is there a way to skip the intel-based stuff in the future? Or is this baked into the M1 chip setup?

@alliblk awesome, glad it works now!

In terms of working with Nextstrain tools, in the near future I don't think there will be any way to support M1 chips without this workaround for the same reasons described in nextstrain/cli#14 (comment), except here we are looking for arm64 platform instead of Windows.

Luckily, the Intel-based emulation workaround is easy, should be a one-time setup, and performs quite well based on my personal use.

I'll keep this issue open and close with a PR that notes the workaround in our install docs.

Could this be a mamba-specific problem? I use an M1 Mac and a native (ambient?) runtime and have not had any issues. The following works out of the box for me:

$ conda --version
conda 4.12.0
$ conda config --get subdir # no output - nothing set
$ conda config --system --get subdir # no output - nothing set

$ conda create -n test   -c conda-forge -c bioconda   nextstrain-cli augur auspice nextalign snakemake git   --yes
...

$ conda activate test

$ augur --version
augur 15.0.2
$ mafft --version
v7.505 (2022/Apr/10)

@jameshadfield could you run python -c "import platform;print(platform.machine())" from that conda environment and check the output?

It's possible that you've installed conda using the osx-64 installer so the whole thing is running emulated. If that's the case, maybe the subdir conda config wouldn't need to be set explicitly.

$ python -c "import platform;print(platform.machine())"
x86_64

So yes -- I'm running things via rosetta :(
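
If you want a plainer readout of the same check, something like the sketch below wraps the platform.machine() call from above. The describe_python_arch function name and its messages are made up for this example. It only interprets the string that Python reports, so it assumes you run it with the conda environment of interest activated.

```shell
# describe_python_arch MACHINE
# MACHINE is the string printed by platform.machine() in the active Python.
# Function name and wording are illustrative, not part of conda.
describe_python_arch() {
  case "$1" in
    arm64)
      echo "arm64: native Apple silicon build" ;;
    x86_64)
      echo "x86_64: Intel build (emulated by Rosetta 2 on an Apple silicon Mac)" ;;
    *)
      echo "$1: unrecognized machine type" ;;
  esac
}

# Only query Python if one is on PATH (e.g. inside an activated environment).
if command -v python >/dev/null 2>&1; then
  describe_python_arch "$(python -c 'import platform; print(platform.machine())')"
fi
```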

@jameshadfield I was doing the same without knowing it! I re-installed conda with the arm64 installer via brew. I also had to recreate conda environments, but it wasn't difficult with the existing commands I had saved. If you don't have commands handy, you can use something like conda env export to save existing environments.
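
A minimal sketch of that export-and-recreate step, in case it helps. The function names export_env_spec and recreate_env are made up for this example. The --from-history flag is my own suggestion rather than something from the comment above: it records only the packages you asked for explicitly, which avoids carrying osx-64 build strings into an osx-arm64 environment. Packages with no osx-arm64 build will still fail to install.

```shell
# export_env_spec ENV_NAME FILE
# Save only the explicitly requested packages of an existing environment.
export_env_spec() {
  conda env export -n "$1" --from-history > "$2"
}

# recreate_env NEW_ENV_NAME FILE
# Create a fresh environment from the saved spec, under a new name.
recreate_env() {
  conda env create -n "$1" -f "$2"
}

# Example usage (environment and file names are placeholders):
#   export_env_spec nextstrain nextstrain-env.yml   # run with the old conda
#   recreate_env nextstrain nextstrain-env.yml      # run with the new conda
```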

Now that I have both osx-64 and osx-arm64 environments, a quick comparison shows a noticeable speedup – an augur filter call that takes 18s on osx-64 takes 11s on osx-arm64! Though, there are still many packages that can't be installed on osx-arm64, so usage can only go so far.
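
To put a number on that single comparison, the 18s and 11s timings work out to roughly a 1.6x speedup. The speedup helper below is made up for this example and just does the division. It reflects one augur filter run, not a benchmark.

```shell
# speedup SLOW_SECONDS FAST_SECONDS
# Prints how many times faster the second timing is (illustrative helper).
speedup() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

speedup 18 11   # prints 1.64
```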

Adding notes here from my experience reinstalling environments using arm64 conda.

  • iqtree can't be found (in the bioconda channel). Using the pre-built binary from http://www.iqtree.org/#download works, although iqtree itself will run under x86 emulation
  • there are arm-compiled MacOS builds for nextclade & nextalign available from https://github.com/nextstrain/nextclade/releases
  • pip wouldn't install scipy, but conda would (didn't investigate the reasons for this, but I only encountered it when switching to arm64 so I presume it's related)
  • mafft was the worst. I ended up building it from source and changing the Makefile to install it into the conda environment's /bin directory (see here for more).
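
Before reinstalling, it may be worth checking which packages have osx-arm64 (or noarch) builds at all. The sketch below loops over package names and asks conda search for each one. The check_arm64_packages function name is made up for this example, and it assumes conda search exits with a non-zero status when nothing matches. Results depend on the state of the channels at the time you run it.

```shell
# check_arm64_packages PKG...
# Report, per package, whether conda-forge/bioconda offer a build that an
# osx-arm64 conda can install. Illustrative helper, not part of conda.
check_arm64_packages() {
  for pkg in "$@"; do
    if conda search --subdir osx-arm64 -c conda-forge -c bioconda "$pkg" >/dev/null 2>&1; then
      echo "$pkg: available"
    else
      echo "$pkg: not found"
    fi
  done
}

# Example usage:
#   check_arm64_packages augur auspice nextalign mafft iqtree
```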

@jameshadfield good to know! Curious if you have/will notice any performance difference between osx-64 and osx-arm64.

For anyone else who's running into this, another recommendation is to just install Conda using the non-native Intel version from the Miniconda installation page (look for any link containing Miniconda3 macOS Intel x86 64-bit). This means emulation will be used for everything conda-related.

  • Pro: No need to worry about any osx-64 vs. osx-arm64 nuances. You should be able to install anything you can normally install with an Intel-based mac.
  • Con: Conda packages that are available for osx-arm64/noarch will run a bit slower

In my opinion, the pros far outweigh the cons, especially for bioinformaticians who want to use tools that don't have native M1 support (e.g. MAFFT).

A curated summary of this discussion is now served as a section in the docs FAQ page.