Container-first strategy
I have had various issues (described later) provisioning VMs with the existing Puppet method. If you have confidence in the Docker image,[^1] as it seems you do, there is a different approach you can take, which I find very elegant:
- You maintain a Dockerfile, as opposed to a Puppet configuration
- This can be an incremental change: slowly remove dependencies from Puppet and bring them into Docker. For the general strategy, it doesn't matter whether you use Puppet internally; that only matters for optimization and readability. The minimal change uses the existing Docker image. See #18 for optimization.[^2]
- You, as CS 162, build images with this Dockerfile
- You can create different tags for different use cases. For example, if you don't need rustup (the autograder almost surely doesn't), you can shave about 1.2 GB off the image!
- Your Docker build will be much faster than any student's Puppet provisioning, because yours will be cached and parallelized. It also only has to happen ~once.
- Students simply install Docker or Podman on their x86 machine, then pull and run the image.
- Docker commit is now possible, in case it is ever necessary
- For the most part, there is no state on the virtual machine besides git config, so you're not adding any duplicated work by using containers. Docker commit can be done by the student (or script) after setting up git, and then this new image should contain all necessary state for the remainder of the class session.
- Of course, you can use volumes or bind mounts as well
- Non-x86 users can emulate at the Docker level (not recommended), or they can still run in a VM. But in this case, the VM is not a critical dependency, can be swapped with others, and is very fast and easy to provision.
- Users not running in a VM will experience much better performance
- The only issue I foresee in the entire end-to-end process is that certain students (those on instructional machines) will not have root access to run the daemon. I have no single solution for this use case,[^3] but most people will not have this issue because Docker on Mac is supposed to support most x86 images, and direct use of QEMU is also possible. It looks like the use of instructional machines is already discouraged, so this is probably not making the problem worse.
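
The student-side steps above (pull, set git config once, commit, then work with a bind mount) could be scripted roughly as follows. This is a minimal sketch, not a tested workflow: the image name `cs162/workspace:latest`, the snapshot tag, the container name, and the `/workspace` mount point are all hypothetical, and it assumes the image provides `git`, `bash`, and a `sleep` that accepts `infinity`. It only prints the commands unless run with `--execute`.

```python
import shlex
import subprocess
import sys

IMAGE = "cs162/workspace:latest"        # hypothetical name of the course-built image
SNAPSHOT = "cs162-workspace:personal"   # hypothetical local tag for the committed image
CONTAINER = "cs162-setup"               # hypothetical name of the one-off setup container


def setup_commands(name, email, platform=None):
    """Return the docker commands for the one-time setup: pull, git config, commit."""
    # Optional emulation flag for non-x86 hosts, e.g. platform="linux/amd64".
    plat = ["--platform", platform] if platform else []
    return [
        ["docker", "pull", *plat, IMAGE],
        # Start a detached container that stays alive so we can exec into it.
        ["docker", "run", "-d", "--name", CONTAINER, *plat, IMAGE,
         "sleep", "infinity"],
        ["docker", "exec", CONTAINER,
         "git", "config", "--global", "user.name", name],
        ["docker", "exec", CONTAINER,
         "git", "config", "--global", "user.email", email],
        # Snapshot the container filesystem (now holding the git config) as an image.
        ["docker", "commit", CONTAINER, SNAPSHOT],
        # Remove the setup container; the snapshot image keeps the state.
        ["docker", "rm", "-f", CONTAINER],
    ]


def work_command(host_dir, container_dir="/workspace", platform=None):
    """Return the docker command for a working session with code bind-mounted."""
    plat = ["--platform", platform] if platform else []
    return [
        "docker", "run", "--rm", "-it", *plat,
        "-v", f"{host_dir}:{container_dir}",  # bind mount: code lives on the host
        "-w", container_dir,
        SNAPSHOT, "bash",                     # name the shell explicitly
    ]


if __name__ == "__main__":
    execute = "--execute" in sys.argv  # default is a dry run that only prints
    for cmd in setup_commands("Student Name", "student@example.com"):
        print(shlex.join(cmd))
        if execute:
            subprocess.run(cmd, check=True)
    print(shlex.join(work_command("/path/to/code")))
```

Passing `platform="linux/amd64"` adds `--platform` to the pull and run steps for the emulation case. The working command names `bash` explicitly because, as far as I understand `docker commit`, the committed image inherits the setup container's command (`sleep infinity` here) as its default.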
I spent over a week in compute just provisioning the VM, and had various issues at various stages on various machines.[^4] This effort would no longer be necessary.
Footnotes
[^1]: i.e., you think containerization is enough and you don't need a dedicated VM.
[^2]: You can also keep the existing Puppet provisioner and leave the current approach as a backup, though the remainder of this document[^3] suggests that this is not necessary.
[^3]: 162 can use their privileged account to start the Docker daemon and run the containers as non-root on behalf of the student, so that the student attaches to an existing container. 162 can run their own x86 VMs (say, within an instructional machine). Students can use a free instance on Oracle Cloud. Students can work directly on the instructional machines as they are doing now.
[^4]: One particularly terrible issue that occurs on only some of my machines is that the provisioning makes the machine extremely slow (`ls` in the home dir takes 20 seconds and new ssh connections can't be established), and the build gets stuck at the stage after matplotlib.