bdh-generalization/requirements

Support for podman


As this is the default for RHEL and used by some hospitals.

Links to #6

Alternative technologies to consider: Podman, Singularity

Refactoring milestones:

  1. There are multiple containerisation alternatives to Docker, Podman and Singularity being the most popular ones. It would be nice to decouple the Node implementation from any particular containerisation technology (the red area shows how it is currently designed) through abstractions, making Node open for extension but closed for modification in this aspect. After this refactoring, the appropriate 'manager' would be selected at runtime.

Image

  2. There are multiple factors to consider when deciding which one should be supported first. One of these is the availability of Python libraries/clients for each technology's API: https://pypi.org/project/podman-py/ , https://singularityhub.github.io/singularity-cli/ (a small usage sketch is included after the image below).

  3. Another factor would be which image-registry environments can manage the images of these technologies, and whether they support features like Notary or Cosign. For instance, as far as I understand, Harbor doesn't support podman images.

Image
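To make the comparison of these client libraries a bit more concrete, here is a minimal, hedged sketch of pulling and running an image with podman-py and spython (singularity-cli). The exact method signatures are assumptions based on each library's documentation and may differ between versions:

```python
# Hedged sketch; method signatures are assumptions and may differ between versions.

# podman-py exposes a docker-py-like client API over the podman socket.
from podman import PodmanClient

with PodmanClient(base_url="unix:///run/podman/podman.sock") as client:
    client.images.pull("docker.io/library/alpine:latest")
    print(client.containers.run("docker.io/library/alpine:latest",
                                ["echo", "hello from podman"], remove=True))

# spython (singularity-cli) wraps the singularity/apptainer command line.
from spython.main import Client as Singularity

sif = Singularity.pull("docker://alpine:latest")   # builds a local .sif file
print(Singularity.execute(sif, ["echo", "hello from singularity"]))
```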

Good stuff! Just to add: I expect a large part of what is now the DockerManager can be generalized into a ContainerManager from which classes like DockerManager and PodmanManager would inherit, as these container technologies build on common container standards.
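To illustrate the kind of generalization described above, a hypothetical sketch (class and method names are placeholders, not the actual vantage6 code base) of a ContainerManager base class with technology-specific subclasses:

```python
# Hypothetical sketch of the proposed generalization; names are placeholders.
from abc import ABC, abstractmethod


class ContainerManager(ABC):
    """Technology-agnostic base class: shared orchestration logic lives here."""

    def run_algorithm(self, image: str, command: list[str]):
        # Shared steps (validation, logging, result handling, ...) would go here.
        self._pull(image)
        return self._run(image, command)

    @abstractmethod
    def _pull(self, image: str) -> None: ...

    @abstractmethod
    def _run(self, image: str, command: list[str]): ...


class DockerManager(ContainerManager):
    def _pull(self, image: str) -> None:
        import docker
        docker.from_env().images.pull(image)

    def _run(self, image: str, command: list[str]):
        import docker
        # Returns the container's stdout as bytes when not detached.
        return docker.from_env().containers.run(image, command, remove=True)


class PodmanManager(ContainerManager):
    def _pull(self, image: str) -> None:
        from podman import PodmanClient
        with PodmanClient() as client:  # may need base_url pointing at the podman socket
            client.images.pull(image)

    def _run(self, image: str, command: list[str]):
        from podman import PodmanClient
        with PodmanClient() as client:
            # podman-py mirrors the docker SDK API (an assumption; signatures may differ).
            return client.containers.run(image, command, remove=True)
```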

Hi all,

Based on the recent discussions around this feature, I'd like to start this follow-up message by proposing to make it more precise (as two interrelated features), as follows:

  • 'Enable the use of alternative daemonless, rootless, containerization technologies for improved security.'
  • 'Provide support for alternative image formats, such as Singularity.'

In the past few weeks, I have been exploring alternatives for implementing these features, as a baseline for the discussion (your comments on them are appreciated!). I did this by (1) going through the containerized elements of the vantage6 architecture (to understand them better), and (2) conducting experiments with the SDKs of multiple containerization platforms, along with Kubernetes, to see to what extent the containerization technology and image format could be decoupled from the v6-node core.

To illustrate these alternatives, it is important to first describe how containers currently work on a vantage6 node (please let me know if I got something wrong). The left image illustrates a v6-node deployed directly on the server which, when needed, runs containerized algorithms through the Docker daemon. These algorithms will eventually need to exchange data with or perform requests to the v6-server through a 'proxy' HTTP server (a server integrated into the node that encrypts and forwards requests to the real one). In this configuration, communication between the algorithm and the proxy is possible because a Docker container has access to the host network by default.

In the configuration on the right (the one suggested in the v6 documentation), the v6-node runs within a Docker container. In this configuration, data exchange between the algorithm and the proxy is also possible by attaching both the v6-node container and the algorithm container to the same virtual Docker network.

Image
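As a concrete illustration of that right-hand configuration, a minimal sketch (names are illustrative only, not the actual vantage6 code) of attaching an algorithm container to a shared Docker network using the docker SDK:

```python
import docker

client = docker.from_env()

# Illustrative name; vantage6 has its own naming conventions.
network = client.networks.create("v6-demo-net", driver="bridge")

# The dockerized v6-node container would also be attached to this network; here we
# only show attaching an algorithm container so it can reach the node's proxy by name.
algorithm = client.containers.run(
    "docker.io/library/alpine:latest",
    ["sh", "-c", "wget -qO- http://v6-node-proxy || true"],  # 'v6-node-proxy' is a hypothetical hostname
    network=network.name,
    detach=True,
)
algorithm.wait()
print(algorithm.logs().decode())
```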

Alternative one

  • The v6-node would still run on Docker, or, ideally, on a rootless alternative like Podman.
  • The DockerManager would be replaced with a 'ContainerManager' abstraction, which defines the operations currently needed by the v6-node (run, getresult, createvolume, etc.). Concrete implementations of this interface for Docker, Podman, and Singularity would be created through the available SDKs for each platform. At runtime, the appropriate manager would be instantiated based on the image format involved in the algorithm execution (see the sketch after the image below).

Image
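A minimal sketch of what that runtime selection could look like; the format-detection rule is illustrative and the manager classes are stubs standing in for the concrete implementations:

```python
# Hypothetical sketch: stub classes stand in for concrete ContainerManager implementations.

class DockerManager:
    def __init__(self, config): self.config = config

class PodmanManager:
    def __init__(self, config): self.config = config

class SingularityManager:
    def __init__(self, config): self.config = config


def select_manager(image_ref: str, node_config: dict):
    """Pick a manager based on the image format (illustrative rule, not the actual design)."""
    if image_ref.endswith(".sif") or image_ref.startswith("library://"):
        return SingularityManager(node_config)
    if node_config.get("prefer_rootless", False):
        return PodmanManager(node_config)
    return DockerManager(node_config)


manager = select_manager("library://some-project/algorithm:latest", {"prefer_rootless": True})
print(type(manager).__name__)  # -> SingularityManager
```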

Still to be explored:

  • A common virtual network for heterogeneous containers is not possible (as far as I know).

Alternative two

  • The v6-node would still run on Docker, or, ideally, on a rootless alternative like Podman.
  • The DockerManager would be replaced with a general-purpose one (for multiple containerization technologies). In this case, the compatibility/transformation features of the v6-node's containerization platform would be used when needed. For example, when a v6-node running on podman needs to run a Singularity container (SIF), the corresponding image would first be converted to an appropriate format (e.g., from SIF to an OCI-compliant one). This way, only one container runtime would be needed. For other formats, compatibility is already there (e.g., between docker and podman). A rough sketch of this dispatch is included after the image below.

Image
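A rough sketch of that dispatch, shelling out to the podman CLI for brevity. Note that the SIF-to-OCI conversion step is a placeholder (raising NotImplementedError), since the practicality of such conversions is exactly what still needs to be explored:

```python
import subprocess
from pathlib import Path


def run_algorithm(image_ref: str, command: list[str]) -> None:
    """Hypothetical single-runtime manager: everything is executed with podman,
    converting non-OCI images first when necessary."""
    if image_ref.endswith(".sif"):
        # Placeholder: an actual SIF -> OCI conversion step would go here; whether
        # such a conversion preserves volume/networking behaviour is an open question.
        oci_ref = convert_sif_to_oci(Path(image_ref))
    else:
        oci_ref = image_ref  # docker/podman images are already OCI-compatible

    subprocess.run(["podman", "run", "--rm", oci_ref, *command], check=True)


def convert_sif_to_oci(sif_path: Path) -> str:
    # Hypothetical helper; the concrete conversion mechanism is still to be explored.
    raise NotImplementedError(f"SIF -> OCI conversion for {sif_path} not decided yet")
```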

Still to be explored:

  • Limitations of 'transformed' images regarding volume management, networking, etc., are still unclear.
  • The best baseline containerization technology (the one to be used in the v6-node), in terms of compatibility with other formats (networking, volume management, etc.), requires further analysis.
  • Singularity seems to be a promising one, given its broad adoption in research environments and its unique features (e.g., its flexibility when it comes to network management).

Alternative three

The Container Runtime Interface (CRI) enables Kubernetes to handle a wide variety of container runtimes. Based on this, the v6-node architecture could be adapted into a single-node K8S architecture by:

  • Setting up a lightweight (yet production-grade) local K8S setup (e.g., k3s, Kind, MicroK8s).
  • Deploying the v6-node in a 'deployment' Pod.
  • Replacing the DockerManager with a general-purpose, K8S-centric one. The operations (run, isrunning, getresult, etc.) would be implemented on top of the K8S API. For example, running a given algorithm would lead to the creation of a 'job' Pod with the corresponding image (a minimal sketch is included after the image below).

Image
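A minimal sketch of that last step, using the official Kubernetes Python client; names, namespace, and image are illustrative only:

```python
from kubernetes import client, config

# Assumes a kubeconfig pointing at the local (e.g., k3s/Kind/MicroK8s) cluster.
config.load_kube_config()

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="v6-algorithm-demo"),  # illustrative name
    spec=client.V1JobSpec(
        backoff_limit=0,
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="algorithm",
                        image="docker.io/library/alpine:latest",  # stand-in for an algorithm image
                        command=["echo", "hello from a v6 'job' pod"],
                    )
                ],
            )
        ),
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```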

Still to be explored:

  • Which lightweight, K8S-conformant 'local' setup would be the most suitable one (production-grade, compatibility with HPC cluster OS/environments, etc.)?

Alternative four

This is an alternative for setting up a v6-node on an HPC infrastructure where a conventional Kubernetes cluster is already in place. This could be done if the node architecture is redesigned as proposed in alternative three.

Image

Still to be explored:

  • In alternative three, where all the Pods are created on the same K8S node, it should be relatively easy to handle input data (also considering the defined access rights) and intermediate results through local volumes, as v6 currently does (a volume sketch is included after this list). How to make alternative three work in multi-node cluster settings as well requires further analysis/discussion.
  • There is a tradeoff to be considered for alternatives three and four. On the one hand, these would bring all the complexity of Kubernetes to the vantage6 platform (development, testing, and deployment could become more difficult). On the other hand, the process of handling container instances from a v6-node would not only be open to other image formats, but its complexity may also be reduced significantly (and it could eventually become more robust).
  • Could this alternative also be helpful for other purposes? E.g., to simplify access to HPC/GPU resources?
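Regarding the local-volume point above, a minimal sketch (assuming the Kubernetes Python client; paths and names are illustrative) of mounting a host directory into an algorithm 'job' Pod on a single-node cluster:

```python
from kubernetes import client

# Illustrative single-node approach: a hostPath volume exposing the node's local
# data/results directory to the algorithm container. On a multi-node cluster a
# PersistentVolumeClaim (or similar) would be needed instead.
pod_spec = client.V1PodSpec(
    restart_policy="Never",
    volumes=[
        client.V1Volume(
            name="v6-data",
            host_path=client.V1HostPathVolumeSource(path="/var/lib/vantage6/demo-node"),
        )
    ],
    containers=[
        client.V1Container(
            name="algorithm",
            image="docker.io/library/alpine:latest",  # stand-in for an algorithm image
            command=["ls", "/mnt/data"],
            volume_mounts=[client.V1VolumeMount(name="v6-data", mount_path="/mnt/data")],
        )
    ],
)
```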

Hi @hcadavid, nice work! It's good to have this overview on paper with clarifying graphics. I think we should discuss this soon.

I have a few questions and comments:

  • Do you have any thoughts on how easy it would be to install each of the options locally at the node organizations? Do they have simple installation processes? What are IT/legal staff in institutions/hospitals familiar with and likely to accept? In our experience, it is very important that the local installation of the nodes is as easy as possible for vantage6 to be a success. Also, updating the nodes should be easy.

  • Do you think flexibility to switch between alternatives would be feasible? I.e., if someone doesn't want to install k8s, they can go for podman. I guess this isn't easy, but can you think of implementations that make this relatively easy?

  • In the intro you discuss the node running directly on the host as well as in a docker container. In practice we only run it dockerized nowadays.

Thank you @hcadavid for this research. Nice work indeed. I think @bartvanb raised some important questions for us. First of all I am most enthusiastic about the Kubernetes solution as it has a lot of pros:

  • We need only a single DockerManager to handle multiple container runtimes
  • Simplify vantage6, as networking is handled by Kubernetes, making it more robust and less effort to maintain
  • More flexible in the future to extend and integrate with other platforms (like vantage6/vantage6#431)

However, the major downside is that Kubernetes becomes a dependency. This might not be ideal. However:

  • I think that Docker is already a bit of a problem for these centers, so I wonder whether Kubernetes poses a bigger problem. At least for our projects, I think we can push centers to use Kubernetes as well. They simply use a VM and install k8s instead of Docker...? Or, if they have the luxury of a cloud environment, they can use the Kubernetes resources of their provider directly.
  • Kubernetes comes bundled with Podman and Docker Desktop by default nowadays.

The alternative, as Bart suggests, is to support both cases. This is less ideal, but it is the only option if the k8s requirement poses bigger problems than the Docker requirement.

I do have some open questions regarding the features of vantage6:

  • How about the VPN feature, which requires changes on the host machine?
  • How about whitelisting (probably doable) and ssh tunneling (probably impossible? but we might use a better network solution from Kubernetes)?
  • It might be good to implement vantage6/vantage6#154 together with this, although Kubernetes does support some form of volumes @lsago (?)

I think @bartvanb and I will investigate a bit if we can overcome the Kubernetes requirements.