Support ARM architecture (multi-arch images)
jiwidi opened this issue · 30 comments
Hi!
I was recently working with some Docker images for JupyterLab to run on Raspberry Pis, and I was wondering whether an image like that would be worth including in this repo, or if it is of value for the Jupyter team. I will be very happy to work towards a contribution for this, but first I want to check whether there is value in it.
Thanks!
Update by Erik
#1399 has been merged - we have published a few amd64 and arm64 compatible images!
Remaining work
- Images to be made arm64 compatible
- base-notebook
- minimal-notebook
- r-notebook
- scipy-notebook
- tensorflow-notebook
- datascience-notebook
- pyspark-notebook
- all-spark-notebook
- minimal-notebook
- base-notebook
- #1401
- #1402
- #1407
Update by Ayaz
scipy-notebook is waiting for conda-forge/bottleneck-feedstock#35.
scipy-notebook and r-notebook are now ready to be built: #1444
I will wait for the PR and update the tree above.
Update: scipy-notebook and r-notebook are now available for arm!
@jiwidi Thanks for your interest in extending the Jupyter ecosystem. #899 has some discussion and movement toward supporting rpi images as community-maintained docker stacks. Perhaps you'd be interested in contributing to that effort?
I think it's a good idea if we both join forces; I left him a comment and let's see how it goes :)
Thanks for the referral!
@jiwidi don't these images run there?
What's the issue? pip takes too long compiling? (add piwheels)
> @jiwidi don't these images run there?
> What's the issue? pip takes too long compiling? (add piwheels)
What do you mean? If you mean whether the normal images can be run on a Raspberry Pi, no they can't. Because of the ARM architecture of RPis, some special tricks or workarounds are needed to compile nodejs and other libraries.
I've started looking into it and the first obvious problem is that base-notebook uses miniconda x86_64 explicitly: https://github.com/jupyter/docker-stacks/blob/master/base-notebook/Dockerfile#L79
The latest miniconda armv7 build is from 2015-08-24 11:01:14 - no good: https://repo.continuum.io/miniconda/
So, yeah, making this image "platform independent" means rebuilding on top of Ubuntu packages or pip (and pip is fragile, very fragile).
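For illustration, a minimal sketch of an architecture-aware installer step, assuming an installer naming scheme with one file per `uname -m` value (Miniforge follows this pattern; the exact URL and flags may differ from what the Dockerfile ends up using):

```bash
# Select the installer that matches the image's CPU architecture instead of
# hardcoding x86_64; "uname -m" prints e.g. x86_64 or aarch64.
arch="$(uname -m)"
installer="Miniforge3-Linux-${arch}.sh"
wget --quiet "https://github.com/conda-forge/miniforge/releases/latest/download/${installer}"
bash "${installer}" -b -p /opt/conda   # -b: batch mode, -p: install prefix
rm "${installer}"
```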
You can check my work here: https://hub.docker.com/repository/docker/step21/jupyter-minimal-notebook (base and minimal so far, soon also scipy-notebook once scikit-image gets merged).
It has both aarch64 and armv7l images, but unless someone gets armv7l packages into conda-forge, I do not think it makes sense to maintain armv7l images.
I'm interested in this topic too.
aarch64 docker images built with my repository are here:
- https://hub.docker.com/r/sakuraiyuta/base-notebook/tags
- https://hub.docker.com/r/sakuraiyuta/minimal-notebook/tags
- https://hub.docker.com/r/sakuraiyuta/all-spark-notebook/tags
- https://hub.docker.com/r/sakuraiyuta/pyspark-notebook/tags
- https://hub.docker.com/r/sakuraiyuta/datascience-notebook/tags
- https://hub.docker.com/r/sakuraiyuta/scipy-notebook/tags
- https://hub.docker.com/r/sakuraiyuta/r-notebook/tags
Also, my Docker Hub repositories contain all the images needed by zero-to-jupyterhub (JupyterHub for Kubernetes, Z2JH). Check them out, prefixed with jupyterhub-k8s-.
I'm using Z2JH on a Kubernetes cluster of four Raspberry Pi 4 (8GB) machines running Ubuntu 20.04 64-bit (arm64/aarch64). It seems to work well for now.
aarch64 images can be built on GitHub Actions (but very slowly).
My GitHub repository includes a workflow for building aarch64 images (see the sketch after this comment):
https://github.com/sakuraiyuta/docker-stacks/blob/fix/aarch64/.github/workflows/docker-aarch64.yml
I hope the maintainer team releases aarch64 images officially.
If you have an interest in this topic, I can create a Pull Request.
See also:
https://discourse.jupyter.org/t/ztjh-on-a-raspberry-pi-k8s-cluster/3043
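The core of such a workflow boils down to registering QEMU emulators and building with buildx; a minimal sketch of the equivalent shell steps (the image name and tag are illustrative, not the workflow's actual values):

```bash
# Register QEMU binfmt handlers so an amd64 runner can execute arm64 binaries.
docker run --rm --privileged tonistiigi/binfmt --install arm64

# Create and select a buildx builder capable of cross-platform builds.
docker buildx create --name multiarch --use

# Build the image for arm64; emulation makes this far slower than a native build.
docker buildx build --platform linux/arm64 \
  --tag sakuraiyuta/base-notebook:latest \
  --load ./base-notebook
```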
Cool! So far I had them built on docker hub after testing locally.
Hello, if you don't mind I will rename this issue to something more general like "Support ARM architecture".
Please also note that I made an attempt in #1202 to build multi-arch images. It works; however, the build time is a pain point in this case.
Sure, please do. Cool that you did that. What did you use for building, such that the build time was such an issue? I just used Docker Hub and that seemed to work relatively quickly. And how big is the difference?
@step21 in fact here we are testing the images before pushing them. In the case of multi-arch images we built an image for each architecture and tested them separately. Everything is in the PR, but it takes ~3x more time to build the minimal-notebook for the 3 architectures because nothing was done in parallel.
How about splitting it? Would be happy to help, but right now I cannot work on it.
In the JupyterHub organization on GitHub, we have now published (thanks @manics!!!) arm64/aarch64 compatible images alongside the amd64/x86_64 ones for multiple repositories.
I would like to help this repo do the same, but at the same time I find it crucial that future maintenance is sustainable. I created the enhancement proposal below with that in mind.
Enhancement proposal
- Principle - to maintain only a single Dockerfile per image (done by #1290)
- Action - to remove the dedicated and outdated ppc64le Dockerfiles and .patch files (done by #1290)
- Implementation detail - maintain a list of the images' compatible architectures
- Action - update the Makefile to be able, in an opt-in fashion, to build the same Dockerfile for all compatible architectures (like in: Z2JH, JupyterHub, ConfigurableHTTPProxy)
Current arm64/aarch64 compatibility status
I went ahead and tried building all Dockerfiles (no patches applied etc) with --platform linux/arm64 using docker buildx instead of docker, and created this arm64 compatibility list (a sketch of the build command follows the list).
- base-notebook
  - --build-arg mambaforge_arch=aarch64 --build-arg mambaforge_checksum=... also passed
  - @... suffix in FROM ubuntu statement removed
- minimal-notebook
- r-notebook
  - conda can't resolve/install r packages - a fix to this may be to use --channel conda-forge based on insights shared by @skumagai
- scipy-notebook
- tensorflow-notebook
  - conda can't install tensorflow of the pinned version
- datascience-notebook
  - hardcoded x86_64 installation of julia
  - conda can't resolve/install r packages (rpy2=3.4 is causing issues I think) - a fix to this may be to use --channel conda-forge based on insights shared by @skumagai
- pyspark-notebook
- all-spark-notebook
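A minimal sketch of the kind of command behind each of these checks (the checksum value is elided here just as in the list above; the ARG names follow the base-notebook notes above):

```bash
# Build the unmodified base-notebook Dockerfile for arm64 via buildx.
# Replace the elided checksum with the real installer SHA256 before running.
docker buildx build --platform linux/arm64 \
  --build-arg mambaforge_arch=aarch64 \
  --build-arg mambaforge_checksum=... \
  ./base-notebook
```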
@consideRatio r-* packages exist on the conda-forge repository.
It would be better if the project policy could agree to use another (non-default) repository.
See also:
https://github.com/sakuraiyuta/docker-stacks/blob/fix/aarch64/r-notebook/Dockerfile.aarch64
@sakuraiyuta can you clarify if I understood you correctly:
You meant that the conda-forge conda channel includes both amd64 and arm64 compatible versions of r-* packages, while the default conda channel only includes amd64 compatible versions of r-* packages. And, due to this, we should consider switching to using the conda-forge conda channel by default instead of the default conda channel?
@consideRatio Sorry, I didn't read your comments carefully, so I simply wrote down a solution for resolving the r-* packages.
After reading your reply, I understand that you tried building without applying any patches to the Dockerfiles.
This project already seems to support a patch process for other architectures.
So, in my forked repository, I created Dockerfile.aarch64.patch and patched it for aarch64.
As you said, some python packages for aarch64/arm64 are not found in the default conda repositories.
It means some notebook images need to switch to another channel to support aarch64, or wait until the default channel supports it.
Maybe the simplest way is adding a -c argument to the default conda/mamba install commands (see the sketch after this comment).
In other words, you completely understood my comment.
(Sorry, my English is not very clear.)
But, strictly speaking, it means changing what the project serves, supports, and tests.
I think we need to consider whether this approach is really appropriate.
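As a concrete illustration of the -c idea (the package names are just examples, not a proposed list):

```bash
# Install R packages explicitly from conda-forge instead of the default channel;
# conda-forge carries aarch64/arm64 builds of many r-* packages.
mamba install --yes -c conda-forge r-base r-ggplot2
```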
@sakuraiyuta and @consideRatio since the PR #1189 we have switched to miniforge instead of miniconda, and so the default channel is conda-forge. So all the packages are already installed from the conda-forge channel.
@romainx @mathbunnyru do you think #1019 (comment) would be something I could work towards?
@consideRatio if you don't mind, I would like to take a few days to give this a thought, because I want to be more familiar with the process of building for other archs in GitHub/Docker environment.
Docker buildx uses QEMU to build images for non-native architectures.
There are a few issues when building multi-arch images simultaneously under the same tag that may have an impact here. This is a good introduction to the overall process https://www.docker.com/blog/multi-arch-build-and-images-the-simple-way/
docker buildx build can build and push multiple architectures at the same time, and takes care of creating a manifest, but it can't load the built image into the local docker host for testing. This means docker run <built-image> won't work unless you've pushed to a registry so it can be pulled straight back. This kind of makes sense, since you can't load an aarch64 image into an amd64 Docker host, but it makes testing a bit more of a pain. The alternative would be to build one architecture at a time, then create your manifest manually with docker manifest, or rerun buildx and rely on the docker layer cache.
The final issue is testing: docker buildx builds for multiple architectures, but as far as I know there's no way to run an image for a non-native platform. In JupyterHub and Z2JH we manually tested the aarch64 images before and after the PRs were merged, but they're not automatically tested - we assume that if there are no build errors and the amd64 image successfully runs then it's probably fine.
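To make the two approaches concrete, a rough sketch (repository and tag names are illustrative; docker manifest may require the experimental CLI on older Docker versions):

```bash
# Approach 1: build and push both architectures in one go; buildx assembles the
# multi-arch manifest, but the images can't be loaded into the local daemon.
docker buildx build --platform linux/amd64,linux/arm64 \
  --tag example/base-notebook:latest --push .

# Approach 2: build and push one architecture at a time, then stitch the
# manifest list together manually.
docker buildx build --platform linux/amd64 --tag example/base-notebook:amd64 --push .
docker buildx build --platform linux/arm64 --tag example/base-notebook:arm64 --push .
docker manifest create example/base-notebook:latest \
  example/base-notebook:amd64 example/base-notebook:arm64
docker manifest push example/base-notebook:latest
```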
Hello @manics I ran into the same issue in my PR #1202. I had to build the image for each architecture one by one and with nothing done in parallel, so it was a bit long. In fact it was the main blocker. However, I was able to run the unit tests on each architecture variant successfully thanks to qemu. I had even modified the tests because pandoc was not available for each architecture.
```python
@pytest.mark.skip_arch(["arm64", "ppc64le"])
def test_pandoc(container):
    """Pandoc shall be able to convert MD to HTML."""
    LOGGER.info(container.image_name)
```
@consideRatio I can confirm that the only modification made to the base-notebook Dockerfile was to remove the SHA pin on the upstream ubuntu image. Everything then can be managed through ARGS -- they have been defined on purpose.
> I would like to help this repo do the same, but at the same time, I find it crucial that future maintenance is sustainable. I created the enhancement proposal below with that in mind.
We would welcome any help with the new archs, so your help would be great.
I will comment on each element of your proposal
- Principle - to maintain only a single Dockerfile per image
💯 agree
- Action - to remove the dedicated and outdated ppc64le Dockerfiles and .patch files
💯 agree
- Implementation detail - maintain a list of the images' compatible architectures
💯 agree
- Action - update the Makefile to be able, in an opt-in fashion, to build the same Dockerfile for all compatible architectures (like in: Z2JH, JupyterHub, ConfigurableHTTPProxy)
This is where "maintenance is sustainable" comes in.
I don't like how the Makefile looked after #1202.
At the same time, I respect @romainx's work because it's an important thing for the community.
Removing the patches will definitely make the Makefile look better.
But adding a lot of multiarch steps is not really good for future maintenance.
As far as I understand, we could, for example, have our own self-hosted runners for arm.
If someone knows the pros/cons of qemu vs self-hosted runners, that would be great to hear.
@mathbunnyru thanks for taking the time to deliberate on my proposal!
I think the proposal I made was quite coarse and needs some additional exploration on the implementation - but it is very central to me that it is sustainable to maintain, as you emphasize.
@romainx @mathbunnyru perhaps I could start working on a separate PR scoped to just points 1 and 2?
Regarding creating maintenance-sustainable support for multiple architectures, I think I need to do some practical exploration and read up on past work to become more clearly opinionated on what I think is a good approach. But here are some thoughts at this point.
- While working towards supporting more architectures, I think we should think of it as a bonus rather than a requirement along the way.
- If we can support running the full build/test suite locally it would be good, but I think running against non-amd64 architectures could be an optional stretch goal. It feels important not to disrupt the local build/test experience for building amd64 images.
- I'm considering minimizing the Dockerfiles' ARGs to just a single one: arch, and instead of embedding a single default checksum for amd64/x86_64, we embed one checksum per supported arch.
- I'm thinking that we want to be able to use docker buildx build --platform linux/amd64,linux/arm64 ... just like we use docker build ..., and that the only customization to the Makefile is one to make it capable of using the multi-platform build in an opt-in manner, perhaps controlled via an ARCH environment variable (see the sketch below).
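A minimal sketch of what that opt-in switch could look like in shell terms (the PLATFORMS variable here is a hypothetical stand-in for the ARCH variable mentioned above, and the image name and target are illustrative, not an existing Makefile interface):

```bash
# Hypothetical opt-in multi-platform build: if PLATFORMS is set, use buildx;
# otherwise fall back to today's plain single-arch docker build.
if [ -n "${PLATFORMS:-}" ]; then
  docker buildx build --platform "${PLATFORMS}" \
    --tag jupyter/base-notebook:latest ./base-notebook
else
  docker build --tag jupyter/base-notebook:latest ./base-notebook
fi
```

If wired into the Makefile, this could be invoked e.g. as PLATFORMS=linux/amd64,linux/arm64 make build/base-notebook, keeping the plain make build path untouched for amd64-only use.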
@consideRatio I don't want to put any pressure on you, but how is it going? :)
Now, I'm more interested in these arm images, because I have recently switched to an M1 Mac, so development is not as easy as it was, but with arm images, it would probably be easy again :)
I created myself a VM on Amazon to have much faster builds.
I found the cause of the issue for scipy-notebook and created an upstream issue.
Did you create an arm64 VM on Amazon and use the normal make build commands - and it worked successfully as an arm64 build, and also gave you much better performance than using the emulator strategy?
@mathbunnyru can you provide a link to that upstream issue?
> @mathbunnyru can you provide a link to that issue?
It's in the top message of this issue.
Everything except datascience and tensorflow images builds fine in CI, so I think we can close this issue.
Datascience notebook builds fine on a real arm machine or vm, but not in qemu.
When we switch to arm workers it will be easy to support this image.
There is no official wheel for arm tensorflow, so for now I don't think there is a strong need for this image. If there is an official wheel one day, then adding support for arm will be a one-line change in the Makefile.
Hi @mathbunnyru
According to actions/runner#805 - ARM Runners are now pre-released. But it looks as if they are only available for self-hosted runners? What's the status of providing the DataScience Notebook with the arm tag?
> Hi @mathbunnyru
> According to actions/runner#805 - ARM Runners are now pre-released. But it looks as if they are only available for self-hosted runners? What's the status of providing the DataScience Notebook with the arm tag?
Hi @florianbaer
- First of all, for now, we don't have any self-hosted runners here and use buildx + QEMU to build arm images.
- datascience-notebook actually builds fine under native aarch64 (I built it on my M1 Mac), but fails under a QEMU emulated environment.
- I think the issue you mentioned doesn't help us. I mean, we don't specifically need macOS aarch64; simple Linux aarch64 would work for us, and it's been available for a while, if I'm right. But still, we will probably have to have self-hosted runners.

Also, it might be worth checking again whether datascience-notebook fails under QEMU, because QEMU has recently released version 7.0.
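A quick way to re-run that check, assuming the binfmt image publishes a tag for the newer QEMU release (the tag name below is an assumption):

```bash
# Re-register binfmt handlers built against QEMU 7.x (tag name assumed),
# then retry the emulated arm64 build of datascience-notebook.
docker run --rm --privileged tonistiigi/binfmt:qemu-v7.0.0 --install arm64
docker buildx build --platform linux/arm64 ./datascience-notebook
```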