epics-containers/epics-base

MultiArchitecture support

Opened this issue · 4 comments

We will support MultiArch in the future using the approach outlined here https://docs.docker.com/build/ci/github-actions/multi-platform/#distribute-build-across-multiple-runners.

TODO: before closing this issue write it up as an architectural decision.

In the meantime I will update the naming conventions of our native linux and cross compiled containers using these notes:

  • Native images are useful for dev because you want to be able to compile and test in a devcontainer
  • So cross compilation should probably only be used if you can't dev on the target (like RTEMS)
  • We will switch to a naming convention of
    • epics-base-developer
    • this is a multi-architecture container, meaning that pulling it gets you a container that loads on your workstation and contains EPICS already built for your workstation
  • cross compile containers will have an additional part to their name with the EPICS_TARGET_ARCH e.g.
    • epics-base-rtems-beatnik-developer
    • For the moment these will all be linux-x86 host architecture BUT could also in future be multi-arch if we really had a need for that
  • so in summary:
    • the host arch is encoded in the multi-arch container
    • the target arch is encoded in the name of the container
    • thus there is one set of multi-arch containers per target arch
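
To make the convention above concrete, here is a minimal sketch of how a name could be derived. The function `image_name` and the `STAGES` tuple are hypothetical helpers for illustration only; they are not part of epics-base or its CI.

```python
from typing import Optional

# The two image flavours mentioned in this issue.
STAGES = ("developer", "runtime")


def image_name(stage: str, target_arch: Optional[str] = None) -> str:
    """Build a container name following the convention in this issue.

    target_arch is the EPICS_TARGET_ARCH of a cross compiled container
    (e.g. "rtems-beatnik"). Leave it as None for the native container,
    where the host arch is carried by the multi-arch manifest instead.
    """
    if stage not in STAGES:
        raise ValueError(f"unknown stage {stage!r}, expected one of {STAGES}")
    parts = ["epics-base"]
    if target_arch:
        # Only cross compiled containers get the extra name component.
        parts.append(target_arch)
    parts.append(stage)
    return "-".join(parts)


print(image_name("developer"))                   # native container
print(image_name("developer", "rtems-beatnik"))  # cross compiled container
```

The host architecture never appears in the name, which matches the summary: host arch lives in the multi-arch container, target arch lives in the name.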

RIGHT NOW this means the following changes to container names:

  • epics-base-linux-developer -> epics-base-developer: multi-arch native
  • epics-base-linux-runtime -> epics-base-runtime: multi-arch native
  • epics-base-rtems-beatnik-developer: stays - the cross compile from linux x86
  • epics-base-rtems-beatnik-runtime: stays - the cross compile from linux x86
  • epics-base-linux-x86_64-developer: REMOVE (holdover from the last iteration of this decision)
  • epics-base-linux-x86_64-runtime: REMOVE (holdover from the last iteration of this decision)
  • epics-base-linux-arm-developer: REMOVE but would now mean cross compile to linux arm 64 from linux x86
  • epics-base-linux-arm-runtime: REMOVE but would now mean cross compile to linux arm 64 from linux x86

@coretl @GDYendell I hope this describes what we discussed?

I'm still not that happy about doing multi-arch

There are two approaches that we can take to make multiarch:

  1. Simple: we just pass multiple platforms to the single buildx GHA. This would require logic inside the Dockerfile to convert the current buildx TARGETARCH into EPICS_HOST_ARCH, and we would also need some way of iterating over the required EPICS 'TARGET_ARCHITECTURE' values inside the Dockerfile. Because this dirties the Dockerfile and would run every build in a single runner, I do not favour this approach.
  2. Multi-runner. Run each target in a separate runner controlled by the build matrix, like we currently do. Gather all of the built containers in build artefacts. Once all are built, use buildx (docker buildx imagetools create) to generate the multi-arch manifest and push it.
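
For reference, the TARGETARCH to EPICS_HOST_ARCH conversion that option 1 needs is roughly the lookup below. This is only a sketch: in a real Dockerfile it would be shell, the function name `epics_host_arch` is hypothetical, and the mapping is my assumption of which EPICS architecture names pair with which buildx platform values.

```python
# Assumed mapping from Docker buildx TARGETARCH values to EPICS_HOST_ARCH
# names. Extend it only with architectures that EPICS base actually builds on.
TARGETARCH_TO_EPICS_HOST_ARCH = {
    "amd64": "linux-x86_64",
    "arm64": "linux-aarch64",
    "arm": "linux-arm",
}


def epics_host_arch(targetarch: str) -> str:
    """Translate a buildx TARGETARCH value into an EPICS_HOST_ARCH name."""
    try:
        return TARGETARCH_TO_EPICS_HOST_ARCH[targetarch]
    except KeyError:
        # Fail loudly rather than guess an architecture name.
        raise ValueError(
            f"no EPICS_HOST_ARCH known for TARGETARCH={targetarch!r}"
        ) from None


for arch in ("amd64", "arm64"):
    print(f"{arch} -> {epics_host_arch(arch)}")
```

Even this small lookup has to be maintained per platform inside the Dockerfile, which is part of why option 1 dirties it.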

Both approaches have a couple of downsides that I'm not happy with:

  1. To get the most common and fastest-built x86_64 target, or any other target, available in the registry you must wait for all targets to build. This slows down the build for all targets (significantly: from 5 mins to 50 mins)
  2. In order to get any target built you must debug issues with every other target first.
  3. It is not clear how to debug native builds when your workstation is of a different architecture (other than by repeated CI invocations, which is prohibitively slow)

Which downside goes with which option?

@GDYendell all downsides go with both options!

I've spoken with Tom on this today. We have the following conclusions:

  • linux-x86_64 is the only supported host used by CI
  • therefore our naming is correct - it's always one of
    • epics-base-developer/runtime (for native compiled x86)
    • epics-base-CROSS_COMPILED_TARGET_NAME-developer/runtime for cross compiled output made in x86 host.
  • in the future we could support other hosts for devcontainers only - not for CI and therefore
    • epics-base would be multiarch to those hosts we support (but we control that so don't give this complexity to other developers)
    • you can use this epics base on a non-x86 workstation by changing EPICS_HOST_ARCH and EPICS_TARGET_ARCH in your devcontainer.json.
    • The Dockerfile for the generic IOC you are using would rebuild that IOC natively in your architecture (maybe - you might need to do some work in ibek-support for that to work - but probably with arm64 you would be good to go)
    • At this point you can compile and test the generic IOC locally - and if you wanted to you could install the relevant cross compilers from your host build for other targets.
    • When you push your changes they will be compiled using x86 host and should hopefully just work (again any caveats on this could hopefully be covered in ibek-support)
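
As a rough illustration of the devcontainer.json change mentioned above, the sketch below generates the fragment for a native build on a given workstation. Assumptions: the variables are set through the standard `containerEnv` property of devcontainer.json, the machine-name mapping is mine, and `devcontainer_env` is a hypothetical helper rather than anything in epics-containers.

```python
import json

# Assumed mapping from machine names (as reported by `uname -m` or
# Python's platform.machine()) to EPICS architecture names.
MACHINE_TO_EPICS_ARCH = {
    "x86_64": "linux-x86_64",
    "amd64": "linux-x86_64",
    "aarch64": "linux-aarch64",
    "arm64": "linux-aarch64",
}


def devcontainer_env(machine: str) -> dict:
    """Return a devcontainer.json fragment for a native build on `machine`."""
    key = machine.lower()
    if key not in MACHINE_TO_EPICS_ARCH:
        raise ValueError(f"no EPICS architecture known for machine {machine!r}")
    arch = MACHINE_TO_EPICS_ARCH[key]
    # For a native build the host and target architectures are the same.
    return {
        "containerEnv": {
            "EPICS_HOST_ARCH": arch,
            "EPICS_TARGET_ARCH": arch,
        }
    }


# Fragment to merge into devcontainer.json on an arm64 workstation.
print(json.dumps(devcontainer_env("aarch64"), indent=2))
```

On an x86 workstation nothing needs to change, since that is the host CI already uses.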

Phew - this still sounds like a bit of a mind warp when I write it down!