Set default OS to bottlerocket

Question

Closed this issue 2 years ago · 11 comments

There were earlier problems with deploying bottlerocket as the default OS. Those problems are thought to depend on the bare metal plan in use.

I'd expect we won't have problems changing over to bottlerocket now.

Answer 1 · 2022-07-25T22:07:32.000Z

Hello! It's great to see you doing this. If you run into any issues, feel free to reach out at bottlerocket-os/bottlerocket.

Answer 2 · 2022-08-23T18:39:16.000Z

We need Bottlerocket as the OS because Ubuntu is no longer available:
#32

Answer 3 · 2022-08-26T14:16:44.000Z

Here are the net.toml changes needed to make Bottlerocket work on an m3.small.x86:

          CONTENTS: |
            # Version is required, it will change as we support
            # additional settings
            version = 1

            # "enp1s0f0np0" is the interface name
            # Users may turn on dhcp4 and dhcp6 via boolean
            [enp1s0f0np0]
            dhcp4 = true
            dhcp6 = false
            # Define this interface as the "primary" interface
            # for the system.  This IP is what kubelet will use
            # as the node IP.  If none of the interfaces has
            # "primary" set, we choose the first interface in
            # the file
            primary = true

The key changes are setting the interface name and disabling dhcp6.

Answer 4 · 2022-08-26T14:17:36.000Z

Here are the bootconfig.data changes needed to make Bottlerocket send console output to our SOS consoles:

          BOOTCONFIG_CONTENTS: |
            kernel {
                console = "ttyS1,115200n8"
            }

Answer 5 · 2022-08-26T14:18:24.000Z

According to the documentation, this should be set in a TinkerbellTemplateConfig, like this:

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellTemplateConfig
metadata:
  name: ${cluster_name}
spec:
  template:
    global_timeout: 6000
    id: ""
    name: ${cluster_name}
    tasks:
    - actions:
      - environment:
          COMPRESSED: "true"
          DEST_DISK: /dev/sda
          IMG_URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/14/artifacts/raw/1-23/bottlerocket-v1.23.7-eks-d-1-23-4-eks-a-14-amd64.img.gz
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/image2disk:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
        name: stream-image
        timeout: 600
      - environment:
          CONTENTS: |
            # Version is required, it will change as we support
            # additional settings
            version = 1

            # "enp1s0f0np0" is the interface name
            # Users may turn on dhcp4 and dhcp6 via boolean
            [enp1s0f0np0]
            dhcp4 = true
            dhcp6 = false
            # Define this interface as the "primary" interface
            # for the system.  This IP is what kubelet will use
            # as the node IP.  If none of the interfaces has
            # "primary" set, we choose the first interface in
            # the file
            primary = true
          DEST_DISK: /dev/sda12
          DEST_PATH: /net.toml
          DIRMODE: "0755"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
        name: write-netplan
        pid: host
        timeout: 90
      - environment:
          BOOTCONFIG_CONTENTS: |
            kernel {
                console = "ttyS1,115200n8"
            }
            init {
                systemd.log_level=debug
            }
          DEST_DISK: /dev/sda12
          DEST_PATH: /bootconfig.data
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
        name: write-bootconfig
        pid: host
        timeout: 90
      - environment:
          DEST_DISK: /dev/sda12
          DEST_PATH: /user-data.toml
          DIRMODE: "0700"
          FS_TYPE: ext4
          GID: "0"
          HEGEL_URLS: http://${pool_admin}:50061,http://${tink_vip}:50061
          MODE: "0644"
          UID: "0"
        image: public.ecr.aws/eks-anywhere/tinkerbell/hub/writefile:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
        name: write-user-data
        pid: host
        timeout: 90
      - image: public.ecr.aws/eks-anywhere/tinkerbell/hub/reboot:6c0f0d437bde2c836d90b000312c8b25fa1b65e1-eks-a-14
        name: reboot-image
        pid: host
        timeout: 90
        volumes:
        - /worker:/worker
    version: "0.1"

Answer 6 · 2022-08-26T14:20:18.000Z

To use this TinkerbellTemplateConfig, you need to modify the generated my-eksa-cluster.yaml file to reference it in the TinkerbellMachineConfig sections, like this:

kind: TinkerbellMachineConfig
metadata:
  name: my-eksa-cluster-cp
spec:
  hardwareSelector:
    type: cp
  osFamily: bottlerocket
  templateRef:
    kind: TinkerbellTemplateConfig
    name: my-eksa-cluster
  users:

Answer 7 · 2022-08-26T14:21:09.000Z

Unfortunately, it seems the current version of EKS-A doesn't respect these overrides. I plan to open a bug on aws/eks-anywhere after validating this is still broken in 0.11.1.

Answer 8 · 2022-08-26T14:34:51.000Z

/cc @stockholmux

Answer 9 · 2022-08-26T15:10:47.000Z

aws/eks-anywhere#3179

Answer 10 · 2022-08-26T19:06:16.000Z

The TinkerbellTemplateConfig file was missing the following:

      name: my-eksa-cluster
      volumes:
        - /dev:/dev
        - /dev/console:/dev/console
        - /lib/firmware:/lib/firmware:ro
      worker: '{{.device_1}}'

Answer 11 · 2022-08-26T22:01:07.000Z

Fixed by #31

New concerns were raised in this comment, and we'll want to track them as separate issues:
#31 (comment)