Tanzu Lab

A collection of tools for running a small Tanzu cluster locally. The cluster is modeled after this blog post.

Warning: I couldn't get Workload Management enabled with this configuration. The vSphere UI gives almost no feedback about what's going on during enablement. I ultimately found that roughly 20 pods were failing to start on the Supervisor VMs, but VMware provides essentially no documentation, logs, or troubleshooting guidance for its opaque deployment process.
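If you hit the same wall, one way to at least see which pods are failing is to SSH into the Supervisor control plane. This is a sketch, assuming the decryptK8Pwd.py helper that ships on the vCenter appliance:

    # On the vCenter appliance (ssh root@vcsa.esxi.test), print the
    # Supervisor control plane IP and its root credentials:
    /usr/lib/vmware-wcp/decryptK8Pwd.py

    # SSH to that IP as root, then look at what isn't starting:
    kubectl get pods -A | grep -v Running
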

Build

  1. Build the base ESXi VM:

    packer build .
    
  2. Deploy ESXi host(s):

    vagrant up
    
  3. Generate a hosts file for the client VM so it knows the IPs and names of the ESXi host(s):

    ./create-client-hosts.sh
    
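    The generated entries look roughly like this (the hostname and IP here are hypothetical; the script derives the real values):

    # Example /etc/hosts entry on the client VM (illustrative values)
    192.168.122.10  esxi-1.esxi.test esxi-1
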
  4. Bring up client machine:

    vagrant up client
    

    This will also:

    • Configure certs and DNS so the client VM can interact with the ESXi host(s)
    • Deploy a Photon OS VM to the first ESXi host; this VM will be configured as a router.
  5. Log in and bring up the web console for the ESXi host:

    vagrant ssh client
    launchUI
    
  6. Log in to the UI as root with password Rootpass1!.

  7. Open a VM console to the router VM and log in (same credentials).

  8. Run the configure script:

    /tmp/scripts/configure-photon-router.sh
    
  9. Deploy vCenter:

    /tmp/scripts/deploy-vcsa.sh
    
  10. Configure vCenter with the prerequisites for the Tanzu deployment:

    /tmp/scripts/configure-vcsa.sh

  11. Update the client so govc connects to vCenter instead of the ESXi host:

    echo "export GOVC_URL='https://administrator@vsphere.local:Rootpass1!@vcsa.esxi.test'" | sudo tee -a /etc/profile.d/govc.sh
    
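    To confirm govc is now talking to vCenter rather than the ESXi host:

    source /etc/profile.d/govc.sh
    # "API type" should report VirtualCenter instead of HostAgent
    govc about
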
  12. After creating the cluster, upgrade the VM compatibility of the vCLS VM via the ESXi host:

    vcls=$(govc find vm -name 'vCLS-*')
    govc vm.upgrade -vm "$vcls"
    
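    To verify the upgrade took, you can read the hardware version back; this uses object.collect with a property path (the exact vmx- number depends on your ESXi build):

    # Prints the virtual hardware version, e.g. "vmx-19"
    govc object.collect -s "$vcls" config.version
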
  13. In the vSphere UI:

    • For the vCLS VM, disable EVC, then power it on
    • Suppress the "no management network redundancy" warning by adding das.ignoreRedundantNetWarning=true to the cluster's vSphere HA advanced options
    • Enable DRS and HA
    • For HA, disable host failover admission control (we only have one host)
  14. Update the router VM to use the additional networks:

    govc vm.network.add -vm router.esxi.test -net VDS/Workload
    govc vm.network.add -vm router.esxi.test -net VDS/Frontend
    
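    You can confirm both NICs were attached before moving on:

    # List the router VM's NICs and the networks they're attached to
    govc device.info -vm router.esxi.test 'ethernet-*'
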
  15. Add static routes so the client machine can reach the additional networks:

    sudo nmcli connection modify eth0 +ipv4.routes "10.10.0.0/24 192.168.122.2"
    sudo nmcli connection modify eth0 +ipv4.routes "10.20.0.0/24 192.168.122.2"
    # Reapply the connection so the new routes take effect
    sudo nmcli connection up eth0
    # Verify the gateway IPs can be pinged now
    ping 10.10.0.1
    ping 10.20.0.1
    
  16. Back in the client VM, deploy the HAProxy VM:

    /tmp/scripts/deploy-ha.sh
    
  17. After it's been deployed and powered on, configure the HAProxy VM:

    /tmp/scripts/configure-ha.sh
    
  18. In the vSphere UI, under the new haproxy.esxi.test VM, update the vApp settings:

    network.frontend_ip = 10.10.0.2/24
    network.frontend_gateway = 10.10.0.1
    
  19. Turn on the HAProxy VM

  20. Perform ping checks to ensure network connectivity between machines. From the client VM:

    /tmp/scripts/ping.sh
    
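    I haven't reproduced ping.sh here, but a minimal equivalent just walks the addresses that matter. The list below is an assumption based on the networks configured above:

    # Hypothetical equivalent of ping.sh; adjust the address list as needed
    for ip in 192.168.122.2 10.10.0.1 10.10.0.2 10.20.0.1 vcsa.esxi.test; do
        ping -c 1 -W 2 "$ip" > /dev/null && echo "OK   $ip" || echo "FAIL $ip"
    done
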
  21. Set up the Content Library with the required Tanzu items:

    govc library.create -sub https://wp-content.vmware.com/v2/latest/lib.json -thumbprint 50:ff:be:b6:a4:89:60:82:65:63:00:5e:f8:6f:9c:e9:ca:6d:50:e6 Tanzu-Library
    
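    The thumbprint pins the CDN's TLS certificate and will go stale when VMware rotates it. If library.create rejects it, fetch the current SHA-1 fingerprint yourself:

    # Print the current SHA-1 fingerprint of the subscription URL's certificate
    openssl s_client -connect wp-content.vmware.com:443 < /dev/null 2>/dev/null \
        | openssl x509 -fingerprint -sha1 -noout
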
  22. Finally, enable Workload Management:

    • I used the GUI. I'll note the weird things that came up:
      • The wizard needs the HAProxy Management TLS Certificate; it's generated by the HAProxy appliance and lives under /etc/haproxy (one way to retrieve it is shown below).
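
    Assuming SSH is enabled on the appliance and the CA uses the default ca.crt name (an assumption; I only confirmed the /etc/haproxy directory), you can pull it out like this:

    # Print the CA certificate to paste into the Workload Management wizard;
    # the ca.crt filename is an assumption, check /etc/haproxy on the appliance
    ssh root@haproxy.esxi.test cat /etc/haproxy/ca.crt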