bioconda/bioconda-utils

Support other architectures also supported by conda-forge (e.g. Mac arm64)

milot-mirdita opened this issue · 25 comments

I've heard from a user of my software that they are using it through bioconda on a Mac M1. It would be great if they could use native bioconda packages for ARM64/Mac M1.

Conda-forge seems to support the following platforms (https://github.com/conda-forge/miniforge):

  • Linux | x86_64 (amd64)
  • Linux | aarch64 (arm64)
  • Linux | ppc64le (POWER8/9)
  • OS X | x86_64
  • OS X | arm64 (Apple Silicon)
  • Windows | x86_64

It would be nice to be able to offer prebuilt bioconda packages for all of these platforms (except Windows I guess). The recipes I maintain would support all of these architectures already.

Personally I would be delighted if BioConda supported Windows, even if initially only a tiny fraction of the recipes actually took advantage of this. I can think of a few bioinformatics recipes which have moved into conda-forge purely for Windows support. However, I take the point that since Windows 10 WSL this is less and less important.

The new Apple arm64 is probably going to be more popular with bioinformaticians given how many of us had a Mac laptop... ;)

My use case for arm64 is the Oxford Nanopore Mk1C (aarch64). I recognise that supporting this is non-trivial but being able to install bioinformatics software on the Mk1C would be very useful as, while it is not a very powerful machine, it is good enough to do quite a lot of bioinformatics "at the point of sequencing".

Just a note that we'd love seeing M1 support for a number of applications, and would be happy to assist where possible in testing. One in particular is anvi'o, which has several bioconda dependencies.

We're in discussion with Amazon regarding getting AWS credits to make this possible.

@dpryan79 is it a matter of ongoing need for AWS credits or more of a one-time thing to get everything up and going and support building all the existing packages/recipes for Apple Silicon?

Is there a funding mechanism for bioconda? E.g. a patreon or gofundme or github sponsorship? As more powerful apple silicon laptops are now showing up in lots of labs and companies, I suspect many people and companies would be willing to kick in some funds to help make this happen. I know my company (I'm part owner) certainly would.

@tfenne It's more a matter of on-going credits. We have some discussions under way with AWS, but I'm not privy to the details on that.

DrYak commented

Aarch64 support is definitely something that would interest me (I have a lightweight chromebook-class Linux laptop that is powered by an Aarch64 chip and runs Manjaro ARM).

Regarding Windows:
WSL2 has become good enough to be a working solution for our users for howto/tutorials.
Given that, and given how much bioinformatics software tends to be cluster- and Unix-oriented, I suspect the bioconda team would prefer not to spend too many resources on that?

Yeah, windows support is a notably lower priority. It would be a real pain to get most things compiling outside of WSL, but a few things are supposed to support windows and the lack of available packages that we produce for that platform has become a problem for them.

Any progress here? Since the last mention of discussions with Amazon it's been 6 months.

Maybe some other infrastructure may have become free to use in the meantime?

I got my first apple silicon macbook just now - I suspect as more and more people switch the momentum for ARM support will grow. And the price of build resources will drop (maybe?).

From the bioconda-recipes ticket on this topic:

osx-arm64 support is on the roadmap for GitHub Actions! More discussion here.

kiwik commented

I use conda and conda-forge on my Linux aarch64 VM and server, and would like bioconda to support Linux aarch64 too, so that I can freely choose Linux aarch64 packages from these channels.

+1 for adding support for Linux ARM64!
We moved many of our cloud deployments to ARM64 because it is cheaper and environmentally friendlier!

Support for Linux ARM64 is much needed!
How can we help with this task?

Hello!
I am also interested in getting bioconda packages working on Linux ARM64, so I tried

$ conda install --file bioconda_utils/bioconda_utils-requirements.txt -c conda-forge -c bioconda

on Linux openEuler 22.03 aarch64.

Initially it failed with these missing packages:

PackagesNotFoundError: The following packages are not available from current channels:

  - beautifulsoup4=4.6
  - regex==2018.08.29
  - involucro=1.1
  - colorlog=3.1
  - jsonschema=2.6
  - skopeo==0.1.35
  - pandas=0.23

After updating some of them:

diff --git a/bioconda_utils/bioconda_utils-requirements.txt b/bioconda_utils/bioconda_utils-requirements.txt
index a796722..f07189f 100644
--- a/bioconda_utils/bioconda_utils-requirements.txt
+++ b/bioconda_utils/bioconda_utils-requirements.txt
@@ -9,16 +9,16 @@ boa=0.9.*
 conda-build=3.21.8
 conda-verify=3.1.*
 argh=0.26.*          # CLI
-colorlog=3.1.*       # Logging
+colorlog=4.8.*       # Logging
 tqdm>=4.26           # Progress monitor
 ruamel_yaml=0.15.*   # Recipe YAML parsing
 pyaml=17.12.*        # Faster YAML parser (deprecate?)
 networkx=2.*
-pandas=0.23.*
+pandas=1.2.*
 numpy=1.19.*         # Avoid breaking pandas on OSX
 libblas=*=*openblas  # Avoid large mkl package (pulled in by pandas)
 boltons=18.*
-jsonschema=2.6.*     # JSON schema verification
+jsonschema=3.2.*     # JSON schema verification
 simplejson           # Used by bioconda bot worker (NEEDED?)
 pyopenssl>=22.1      # Stay compatible with cryptography
 
@@ -32,7 +32,7 @@ skopeo=0.1.35          # docker upload
 git=2.*                # well - git
 
 # hosters - special regex not supported by RE
-regex=2018.08.29
+regex=2022.7.9
 
 # asyncio
 aiohttp=3.8.*      # HTTP lib
@@ -51,7 +51,7 @@ gidgethub=3.0.*           # githubhandler
 pyjwt>=2.4.0              # githubhandler (JWT signing), needs >=2.4.0, CVE-2022-29217
 
 # unknown
-beautifulsoup4=4.6.*
+beautifulsoup4=4.8.*
 galaxy-lib>=18.9.1
 jinja2>=2.10.1,<3
 markupsafe<2.1           # markupsafe 2.1 breaks jinja2
@@ -69,4 +69,4 @@ graphviz
 requests=2.22.*
 
 # merge handling
-pygithub
\ No newline at end of file
+pygithub

Now the only missing packages are:

PackagesNotFoundError: The following packages are not available from current channels:

  - involucro=1.1
  - skopeo==0.1.35

https://anaconda.org/search?q=involucro says that this package is available only for linux-64 and osx-64.
https://anaconda.org/search?q=skopeo is available for linux-64, osx-64 and osx-arm64. Update: filed an issue

I will try to find out what would be needed to get those two packages working on linux-arm64 but any help would be very welcome!
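
The list of unavailable specs can also be pulled out of conda's error text programmatically, which helps when retrying after each round of version bumps. A minimal sketch, assuming the message layout shown above; `missing_specs` is a hypothetical helper and not part of bioconda-utils:

```python
def missing_specs(conda_output):
    """Collect the specs listed under a PackagesNotFoundError message.

    Assumes the layout shown above: a header line starting with
    'PackagesNotFoundError', then one '- <spec>' bullet per package.
    """
    specs = []
    in_block = False
    for line in conda_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("PackagesNotFoundError"):
            in_block = True
        elif in_block and stripped.startswith("- "):
            specs.append(stripped[2:].strip())
        elif in_block and stripped and specs:
            break  # first non-bullet line after the list ends the block
    return specs


sample = """PackagesNotFoundError: The following packages are not available from current channels:

  - involucro=1.1
  - skopeo==0.1.35
"""
for spec in missing_specs(sample):
    # Split off the version pin to get the bare package name.
    print(spec.split("=")[0], "<-", spec)
```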

I've filed an issue for skopeo.

But involucro seems more complicated since it is part of Bioconda itself. How can we bootstrap it?

Yikun commented

It seems involucro and skopeo are not required at the build stage. I tried installing bioconda-utils manually and found another issue.

When I try to build a package:

bioconda-utils build --docker --mulled-test --packages bioconductor-a4 --force

The build stage depends on a Docker image that is built here using the Dockerfile below:

FROM quay.io/condaforge/linux-anvil-cos7-x86_64 as base

Unfortunately, it only supports x86_64 for now.

Let's make it a build-arg, so one can use https://quay.io/repository/condaforge/linux-anvil-aarch64 or others if needed?
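
To illustrate the build-arg idea: a minimal sketch of assembling the `docker build` invocation from Python. It assumes the Dockerfile were changed to declare `ARG BASE_IMAGE` before a `FROM ${BASE_IMAGE} as base` line. The argument name `BASE_IMAGE`, the helper `docker_build_command`, and the image tag are all hypothetical.

```python
import shlex


def docker_build_command(base_image, tag, context="."):
    """Return the argv for a `docker build` that selects the base image.

    Assumption: the Dockerfile declares `ARG BASE_IMAGE` and uses it in
    its FROM line. BASE_IMAGE is an illustrative name, not an existing
    bioconda-utils convention.
    """
    return [
        "docker", "build",
        "--build-arg", f"BASE_IMAGE={base_image}",
        "-t", tag,
        context,
    ]


cmd = docker_build_command(
    "quay.io/condaforge/linux-anvil-aarch64",
    "bioconda-utils-build-env:aarch64",  # hypothetical local tag
)
# Print a copy-pasteable shell command; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Keeping the x86_64 image as the default value of the `ARG` would leave existing builds unchanged.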

Yikun commented

Sharing my attempt at supporting bioconda-utils on Linux aarch64.

1. Support bioconda-utils docker image for Linux aarch64

  • Apply patch for aarch64 Dockerfile
diff --git a/Dockerfile b/Dockerfile
index 251af12..cb6bad6 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,4 +1,4 @@
-FROM quay.io/condaforge/linux-anvil-cos7-x86_64 as base
+FROM quay.io/condaforge/linux-anvil-aarch64 as base

 # Copy over C.UTF-8 locale from our base image to make it consistently available during build.
 COPY --from=quay.io/bioconda/base-glibc-busybox-bash /usr/lib/locale/C.UTF-8 /usr/lib/locale/C.UTF-8
  • Build a local image
docker build -t  quay.io/bioconda/bioconda-utils-build-env-cos7:aarch64 .

2. Support bioconda-utils to use custom docker base image

  • Add --docker-base-image
diff --git a/bioconda_utils/cli.py b/bioconda_utils/cli.py
index c723ff7..aa88de1 100644
--- a/bioconda_utils/cli.py
+++ b/bioconda_utils/cli.py
@@ -422,12 +422,15 @@ def do_lint(recipe_folder, config, packages="*", cache=None, list_checks=False,
      than one worker, then make sure to give each a different offset!''')
 @arg('--keep-old-work', action='store_true', help='''Do not remove anything
 from environment, even after successful build and test.''')
+@arg('--docker-base-image', help='''Name of base image that can be used in
+**dockerfile_template**.''')
 @enable_logging()
 def build(recipe_folder, config, packages="*", git_range=None, testonly=False,
           force=False, docker=None, mulled_test=False, build_script_template=None,
           pkg_dir=None, anaconda_upload=False, mulled_upload_target=None,
           build_image=False, keep_image=False, lint=False, lint_exclude=None,
-          check_channels=None, n_workers=1, worker_offset=0, keep_old_work=False):
+          check_channels=None, n_workers=1, worker_offset=0, keep_old_work=False,
+          docker_base_image='quay.io/bioconda/bioconda-utils-build-env-cos7:{}'.format(VERSION.replace('+', '_'))):
     cfg = utils.load_config(config)
     setup = cfg.get('setup', None)
     if setup:
@@ -453,6 +456,7 @@ def build(recipe_folder, config, packages="*", git_range=None, testonly=False,
             use_host_conda_bld=use_host_conda_bld,
             keep_image=keep_image,
             build_image=build_image,
+            docker_base_image=docker_base_image
         )
     else:
         docker_builder = None

3. Support build script template for Linux aarch64

diff --git a/bioconda_utils/docker_utils.py b/bioconda_utils/docker_utils.py
index a43d864..0edbc60 100644
--- a/bioconda_utils/docker_utils.py
+++ b/bioconda_utils/docker_utils.py
@@ -443,7 +443,7 @@ class RecipeBuilder(object):
         # Write build script to tempfile
         build_dir = os.path.realpath(tempfile.mkdtemp())
         script = self.build_script_template.format(
-            self=self, arch='noarch' if noarch else 'linux-64')
+            self=self, arch='noarch' if noarch else 'linux-aarch64')
         with open(os.path.join(build_dir, 'build_script.bash'), 'w') as fout:
             fout.write(script)
         build_script = fout.name

The architecture was hard-coded where the build script template is filled in. I patched it temporarily, but I haven't yet worked out how to support this gracefully.
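
One possible way to avoid the hard-coded value is to derive the conda subdir from the machine running the build. A minimal sketch; `conda_subdir` is a hypothetical helper (not part of bioconda-utils), and it assumes the build container runs on the same architecture as the host:

```python
import platform

# Map (system, machine) as reported by the `platform` module to conda subdirs.
_SUBDIRS = {
    ("Linux", "x86_64"): "linux-64",
    ("Linux", "aarch64"): "linux-aarch64",
    ("Linux", "arm64"): "linux-aarch64",
    ("Linux", "ppc64le"): "linux-ppc64le",
    ("Darwin", "x86_64"): "osx-64",
    ("Darwin", "arm64"): "osx-arm64",
}


def conda_subdir(noarch=False, system=None, machine=None):
    """Return the conda subdir for a build (hypothetical helper).

    `system` and `machine` default to the values detected on the
    running interpreter; pass them explicitly to override detection.
    """
    if noarch:
        return "noarch"
    key = (system or platform.system(), machine or platform.machine())
    try:
        return _SUBDIRS[key]
    except KeyError:
        raise ValueError(f"unsupported platform: {key}") from None
```

With something like this, the template call could pass `arch=conda_subdir(noarch)` instead of a literal string.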

4. Build a package with --docker-base-image specified:

$ bioconda-utils build --docker --packages py2bit --docker-base-image quay.io/bioconda/bioconda-utils-build-env-cos7:aarch64 --force

18:52:04 BIOCONDA INFO (OUT) Total time: 0:07:35.5
18:52:04 BIOCONDA INFO (OUT) CPU usage: sys=0:00:00.8, user=0:00:02.2
18:52:04 BIOCONDA INFO (OUT) Maximum memory usage observed: 35.3M
18:52:04 BIOCONDA INFO (OUT) Total disk usage observed (not including envs): 1.8K
Subdir: noarch:  67% 2/3 [00:00<00:00, 21.70it/s]95it/s]

18:52:13 BIOCONDA INFO BUILD SUCCESS py2bit-0.3.0-py37hc656d7e_7.tar.bz2 py2bit-0.3.0-py36h47cae77_7.tar.bz2 py2bit-0.3.0-py310hc420da2_7.tar.bz2 py2bit-0.3.0-py27h2dda9c8_7.tar.bz2 py2bit-0.3.0-py38h14a15a9_7.tar.bz2 py2bit-0.3.0-py39h11ab828_7.tar.bz2
18:52:13 BIOCONDA INFO (COMMAND) conda build purge
18:52:14 BIOCONDA INFO BUILD SUMMARY: successfully built 1 of 1 recipes

The build succeeded, and I can see the built packages under the ~/miniconda/envs/bioconda/conda-bld/linux-aarch64/ path.

BTW, when I tried to build involucro on Linux aarch64, it failed the same way as on x86_64: bioconda/bioconda-recipes#40063.

Yikun commented

@dpryan79 Would you mind sharing some ideas on how we can support bioconda on Linux aarch64? Maybe we can donate some Linux aarch64 machines to build and test each package.

Just FYI: we have worked with the Bioconductor community to validate, test, and fix errors for Bioconductor on Linux aarch64 [1][2], and have made pretty good progress.

[1] Bioconductor/BBS#255 (comment)
[2] https://yikun.github.io/bioconductor-0301/report/long-report.html

Yikun commented

An update on our progress toward supporting bioconda-utils on Linux aarch64:

First, we supported the bioconda-utils deps on linux aarch64 (Resolved issue mentioned in #706 (comment)):

Then, we added Linux aarch64 support to bioconda-utils and built and released a Linux aarch64 image: #866. After this patch we can build bioconda packages on Linux aarch64!

I just opened bioconda/bioconda-containers#55 - a PR that updates the base images to be multi-arch (linux/amd64 and linux/arm64/v8).
Any feedback is welcome!

By the way the Bioconductor project recently added Linux ARM64 to their list of supported architectures (experimentally):

dgorhe commented

What are the steps needed to resolve this issue and specifically to get support for M-series Macs (i.e. arm64)? I have the following incompatibilities when trying to run the setup commands in the README.md file on my M2 Mac.

The following packages are incompatible
├─ beautifulsoup4 4.8**  does not exist (perhaps a typo or a missing channel);
├─ involucro 1.1**  does not exist (perhaps a typo or a missing channel);
└─ requests 2.22**  does not exist (perhaps a typo or a missing channel).

With the following changes, I can install everything except involucro

diff --git a/bioconda_utils/bioconda_utils-requirements.txt b/bioconda_utils/bioconda_utils-requirements.txt
index db098c3..1448e3b 100644
--- a/bioconda_utils/bioconda_utils-requirements.txt
+++ b/bioconda_utils/bioconda_utils-requirements.txt
@@ -51,7 +51,7 @@ gidgethub=3.0.*           # githubhandler
 pyjwt>=2.4.0              # githubhandler (JWT signing), needs >=2.4.0, CVE-2022-29217
 
 # unknown
-beautifulsoup4=4.8.*
+beautifulsoup4=4.12.*
 galaxy-lib>=18.9.1
 jinja2>=2.10.1,<3
 markupsafe<2.1           # markupsafe 2.1 breaks jinja2
@@ -66,7 +66,7 @@ markdown
 graphviz
 
 # The bioconductor skeleton needs this
-requests=2.22.*
+requests=2.28.*
 
 # merge handling
 pygithub
@@ -76,4 +76,4 @@ diskcache =5.*
 appdirs =1.*
 
 # build failure output
-tabulate =0.9
\ No newline at end of file
+tabulate =0.9

Here's the output from trying to create a conda environment with the changes:

The following package could not be installed
└─ involucro 1.1**  does not exist (perhaps a typo or a missing channel).

involucro doesn't have any versions on conda-forge for arm64. My understanding is that modifying the conda-forge recipe for involucro to include arm64 builds would resolve this issue. Is that correct? I'm quite new to open source so I apologize if this is a trivial or annoying question.

Yikun commented

@dgorhe

For the missing requirements, we should bump the pinned versions to ones that support M1.

For the missing involucro, we need to add M1 support in conda-forge. It just needs changes like those we made for linux aarch64 (see also: bioconda/bioconda-recipes#40144 (comment); involucro has been moved to the conda-forge channel).

conda-forge/staged-recipes#22518