kubernetes-sigs/image-builder

Speed up Flatcar QEMU builds

invidian opened this issue · 13 comments

Is your feature request related to a problem? Please describe.

Right now, the Flatcar QEMU build downloads the ISO image via Packer; then, inside the VM, the Flatcar installation script downloads the image again to install Flatcar to the disk and reboot. This is disk- and network-intensive and could possibly be improved.

Describe the solution you'd like

Perhaps it would be possible to use the "disk_image" and "qemuargs" options of the Packer QEMU builder to pass the Ignition config directly to the VM, skipping one level of indirection and getting a VM booted directly with SSH available.

One blocker for this might be that, IIRC, Flatcar currently does not offer uncompressed QEMU disk images, so we would either have to make them available or download and unpack the image manually before running Packer. (EDIT: reported flatcar/Flatcar#791)
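As a rough illustration of the "download and unpack manually" option, here is a minimal Python sketch that decompresses an already-downloaded image and derives the values the PoC needs for "iso_url", "iso_checksum" and "iso_checksum_type". It assumes the image is bzip2-compressed and already on local disk; the function names (unpack_bz2, sha256_of, packer_vars) are hypothetical helpers, not part of image-builder or Flatcar tooling.

```python
import bz2
import hashlib
import shutil
from pathlib import Path


def unpack_bz2(src: Path, dst: Path, chunk_size: int = 1 << 20) -> Path:
    """Stream-decompress a .bz2 file to dst without loading it into memory."""
    with bz2.open(src, "rb") as compressed, open(dst, "wb") as raw:
        shutil.copyfileobj(compressed, raw, chunk_size)
    return dst


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def packer_vars(image: Path) -> dict:
    """Build the Packer user variables that point at a local disk image.

    The keys mirror the ones changed in the PoC diff; the helper itself
    is an assumption made for illustration.
    """
    image = image.resolve()  # as_uri() requires an absolute path
    return {
        "iso_url": image.as_uri(),
        "iso_checksum": sha256_of(image),
        "iso_checksum_type": "sha256",
        "disk_image": "true",
    }
```

The resulting dictionary could be written to a JSON file and passed to Packer with -var-file, which would avoid hardcoding the checksum and the absolute path.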

Here is a PoC solution which seems to work:

diff --git images/capi/packer/qemu/packer.json images/capi/packer/qemu/packer.json
index 5c626e9c0..c30828f56 100644
--- images/capi/packer/qemu/packer.json
+++ images/capi/packer/qemu/packer.json
@@ -28,7 +28,13 @@
       "ssh_timeout": "2h",
       "ssh_username": "{{user `ssh_username`}}",
       "type": "qemu",
-      "vm_name": "{{user `build_name`}}-kube-{{user `kubernetes_semver`}}"
+      "vm_name": "{{user `build_name`}}-kube-{{user `kubernetes_semver`}}",
+      "disk_image": "{{user `disk_image`}}",
+      "qemuargs": [
+        [
+          "-fw_cfg", "name=opt/org.flatcar-linux/config,file=/<absolute path to image builder repo>/image-builder/images/capi/packer/qemu/flatcar/ignition-builder.json"
+        ]
+      ]
     }
   ],
   "post-processors": [
@@ -165,6 +171,7 @@
     "python_path": "",
     "qemu_binary": "qemu-system-x86_64",
     "ssh_password": "builder",
-    "ssh_username": "builder"
+    "ssh_username": "builder",
+    "disk_image": "false"
   }
 }
diff --git images/capi/packer/qemu/qemu-flatcar.json images/capi/packer/qemu/qemu-flatcar.json
index bb10cccce..120456f84 100644
--- images/capi/packer/qemu/qemu-flatcar.json
+++ images/capi/packer/qemu/qemu-flatcar.json
@@ -1,6 +1,5 @@
 {
   "ansible_extra_vars": "ansible_python_interpreter=/opt/bin/python",
-  "boot_command_prefix": "sudo systemctl mask sshd.socket --now<enter>curl -sLo /tmp/ignition.json https://raw.githubusercontent.com/flatcar-linux/flatcar-packer-qemu/917f209e1afd262e71f41c65c1295f29c08cb8c6/ignition-builder.json<enter>sudo flatcar-install -d /dev/sda -C {{user `channel_name`}} -V {{user `release_version`}} -i /tmp/ignition.json<enter>sudo reboot<enter>",
   "boot_media_path": "",
   "boot_wait": "120s",
   "build_name": "flatcar-{{env `FLATCAR_CHANNEL`}}-{{env `FLATCAR_VERSION`}}",
@@ -9,9 +8,9 @@
   "distro_name": "flatcar",
   "guest_os_type": "linux-64",
   "http_directory": "",
-  "iso_checksum": "https://{{env `FLATCAR_CHANNEL`}}.release.flatcar-linux.net/amd64-usr/{{env `FLATCAR_VERSION`}}/flatcar_production_iso_image.iso.DIGESTS.asc",
-  "iso_checksum_type": "file",
-  "iso_url": "https://{{env `FLATCAR_CHANNEL`}}.release.flatcar-linux.net/amd64-usr/{{env `FLATCAR_VERSION`}}/flatcar_production_iso_image.iso",
+  "iso_checksum": "e0250408f3f5fbe3e6dca5a88bef0dc9f6bb3dc8f4a16f7ecf0ab7d775ac42a2",
+  "iso_checksum_type": "sha256",
+  "iso_url": "file:///<absolute path to image>/flatcar_production_qemu_image.img",
   "kubernetes_cni_source_type": "http",
   "kubernetes_source_type": "http",
   "os_display_name": "Flatcar Container Linux ({{env `FLATCAR_CHANNEL`}} channel release {{env `FLATCAR_VERSION`}})",
@@ -20,5 +19,6 @@
   "shutdown_command": "shutdown -P now",
   "systemd_prefix": "/etc/systemd",
   "sysusr_prefix": "/opt",
-  "sysusrlocal_prefix": "/opt"
+  "sysusrlocal_prefix": "/opt",
+  "disk_image": "true"
 }
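The diff above hardcodes an absolute path in the "-fw_cfg" argument. One way to avoid that could be to generate the "qemuargs" entry from a relative path before invoking Packer. The following is only a sketch under that assumption: fw_cfg_qemuargs is a hypothetical helper, and the fw_cfg key name is copied from the diff.

```python
import json
from pathlib import Path

# fw_cfg key used in the PoC diff to hand the Ignition config to Flatcar.
FW_CFG_NAME = "opt/org.flatcar-linux/config"


def fw_cfg_qemuargs(ignition_path: str) -> list:
    """Return a Packer "qemuargs" value that passes an Ignition file via fw_cfg.

    Hypothetical helper: resolves the path to an absolute one and checks
    that the file parses as JSON before it is handed to QEMU.
    """
    path = Path(ignition_path).resolve()
    json.loads(path.read_text())  # fail early if the config is not valid JSON
    # QEMU splits option values on commas; a literal comma is written as ",,".
    escaped = str(path).replace(",", ",,")
    return [["-fw_cfg", f"name={FW_CFG_NAME},file={escaped}"]]
```

For example, fw_cfg_qemuargs("images/capi/packer/qemu/flatcar/ignition-builder.json") would return a single "-fw_cfg" pair with the path expanded relative to the current working directory.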

Additional context

If that happens, perhaps it should be contributed to https://github.com/flatcar-linux/flatcar-packer-qemu.

/kind feature

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

/remove-lifecycle stale

/lifecycle stale

/lifecycle rotten

/remove-lifecycle rotten

/lifecycle stale

/remove-lifecycle stale

@tormath1 Thoughts?

@mdbooth it would be nice to see this, especially if we build Flatcar images without KVM in GitHub Actions. This issue is already tracked as part of the Cluster API Flatcar Roadmap (https://github.com/orgs/flatcar/projects/7/views/14), but it is not prioritized.