Failed with docker-compose sandbox
Closed this issue · 4 comments
I want to provision a few bare-metal HP desktop machines with tinkerbell.
Expected Behaviour
I follow the steps in the docker-compose sandbox, and will afterwards have a machine with a provisioned OS (Ubuntu) on it.
https://github.com/tinkerbell/sandbox/blob/main/docs/quickstarts/COMPOSE.md
Current Behaviour
All containers are started, exactly as in the COMPOSE.md expected-output-step-4.
After that, nothing happens.
Possible Solution
- Please describe some ideas on how to debug, or where to look for potential configuration errors.
- I have a few years of software development experience and basic networking knowledge, and I know my way around Docker.
Even with this background, I saw no way to get any further, although Tinkerbell really looks nice from a bird's-eye perspective.
Steps to Reproduce (for bugs)
In the hardware.yaml, certain values remain unexplained:
- $TINKERBELL_CLIENT_GW: is this the gateway to the "outer" network?
- How shall I know the TINKERBELL_CLIENT_IP? The (Tink) DHCP server is supposed to create one for that machine...
- metadata, facility?
- Can I modify the hardware.yaml while the Tink-containers are running? How do I add a second machine?
All in all, the documentation left too many questions open for me. Sorry to say that.
Context
I wanted to try out the most basic step, install Ubuntu on a bare metal machine. Failed.
Your Environment
docker-compose running on macOS (the provisioner).
5-port switch connected over USB to the provisioner. Provisioner connected over a second Ethernet interface to the DSL router/internet.
HP EliteDesk mini PCs connected to the switch. Boot order (in BIOS) set to 1. Netboot, 2. USB, 3. SSD.
The DHCP scan on the HP starts, but Tinkerbell does not provide an IP address to the client.
How are you running Tinkerbell? Using Vagrant & VirtualBox, Vagrant & Libvirt, on Packet using Terraform, or give details:
Link to your project or a code example to reproduce issue:
apiVersion: "tinkerbell.org/v1alpha1"
kind: Hardware
metadata:
  name: machine1
spec:
  disks:
    - device: $DISK_DEVICE
  metadata:
    facility:
      facility_code: sandbox
    instance:
      hostname: "machine1"
      id: "$TINKERBELL_CLIENT_MAC"
      operating_system:
        distro: "ubuntu"
        os_slug: "ubuntu_20_04"
        version: "20.04"
  interfaces:
    - dhcp:
        arch: x86_64
        hostname: machine1
        ip:
          address: 192.168.178.230
          gateway: $TINKERBELL_CLIENT_GW
          netmask: 255.255.255.0
        lease_time: 86400
        # mac: "FC:3F:DB:05:70:CD"
        mac: "EC:8E:B5:77:9E:2D"
        name_servers: [],
        uefi: false
      netboot:
        allowPXE: true
        allowWorkflow: true
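One thing that can be checked from the values above without any Tinkerbell knowledge is whether the static client address and the gateway fall into the same subnet under the configured netmask. The following is a minimal POSIX shell sketch of that check. The helper names `ip_to_int` and `same_subnet` are hypothetical, and `192.168.178.1` is only an assumed gateway for illustration, since the report leaves `$TINKERBELL_CLIENT_GW` unresolved.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Tinkerbell or the sandbox.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  # Split the address on dots into $1..$4.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit status 0) when both addresses share a subnet
# for the given netmask.
# Usage: same_subnet <address> <gateway> <netmask>
same_subnet() {
  a=$(ip_to_int "$1")
  b=$(ip_to_int "$2")
  m=$(ip_to_int "$3")
  [ $(( a & m )) -eq $(( b & m )) ]
}

# Address and netmask come from the hardware.yaml above;
# the gateway 192.168.178.1 is an assumption.
if same_subnet 192.168.178.230 192.168.178.1 255.255.255.0; then
  echo "client address and gateway share a subnet"
else
  echo "client address and gateway are in different subnets"
fi
```

This only validates the addressing arithmetic. It says nothing about whether DHCP traffic actually reaches the provisioner.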
Hey @gernotstarke, thank you for trying out Tinkerbell and posting your experience here! On the surface, the config all looks like it should work properly. Docker on macOS is actually not supported, though. The way Docker works on macOS means that we won't be able to see broadcast traffic on the network. You will need a machine that has a network interface on the same layer 2 as the machine(s) to provision.
I can think of a few possible options here. You could create a VM on your Mac with a bridged network interface to your layer 2, then set up the Tinkerbell stack on there. If you have another machine that is directly connected to the same layer 2, you could use that.
Again, thanks for trying Tinkerbell out! Apologies for the missing and outdated documentation. We definitely have some very large gaps that need to be filled. We are trying our best to get our documentation updated. Thanks for your patience!
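Whether the provisioner can see the client's DHCP broadcasts at all can be checked with a packet capture on the interface facing the switch. This is a rough sketch, assuming `tcpdump` is installed on the provisioner or in the bridged VM. The function name `watch_dhcp` is hypothetical, and the interface name has to be supplied by whoever runs it.

```shell
#!/bin/sh
# Hypothetical helper: watch DHCP/BOOTP traffic on one interface.
# Usage: watch_dhcp <interface>   (usually needs root; stop with Ctrl-C)
watch_dhcp() {
  iface="${1:?usage: watch_dhcp <interface>}"
  # -n: skip name resolution, -e: print link-level (MAC) addresses.
  # DHCP servers listen on UDP port 67, clients on UDP port 68.
  tcpdump -i "$iface" -n -e 'udp and (port 67 or port 68)'
}
```

For example, `watch_dhcp eth1` (interface name assumed) while the HP machine netboots. If no request from the client MAC `ec:8e:b5:77:9e:2d` shows up, the broadcast is not reaching this host, which would match the layer 2 explanation above.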
thanx @jacobweinstock for your answer.
Same situation here. I tried the compose stack, and when running
KUBECONFIG=./state/kube/kubeconfig.yaml kubectl get -n tink-system workflow sandbox-workflow --watch
the tink-system namespace doesn't exist, and the machine can't boot at all.
I'm using Debian 11 with the latest Docker installed, as well as kubectl and minikube. I will try the Vagrant version to see if I have any luck.
Edit: Even when reproducing the Vagrant example, I got sh: 1: tink: not found when running the Postgres background command to see the workflow outputs. Tink seems not to be installed either.
Hello,
same behaviour as @pedroalvesbatista for me.
I set up an Ubuntu 22.04 VM (vCenter) with git, docker, docker-compose, and kubectl installed.
The network flow is OK for DHCP/PXE when trying to bootstrap a bare-metal HP server. The client IP has been allocated, as I'm able to ping the new "host" after 10 to 15 minutes.
But there is no further activity. I tried rebooting the host, but the provisioning phases just restarted in a loop.
Only the 3 mandatory variables were provided: TINKERBELL_CLIENT_IP, TINKERBELL_CLIENT_MAC, and TINKERBELL_HOST_IP.
I also tested filling in the hardware.yaml with the previous variables plus $TINKERBELL_CLIENT_GW and the resolvers used on my side.
I do not have the tink-system namespace, so I cannot troubleshoot there.
sandbox/deploy/stack/compose# KUBECONFIG=./state/kube/kubeconfig.yaml kubectl get ns
NAME              STATUS   AGE
default           Active   2d
kube-system       Active   2d
kube-public       Active   2d
kube-node-lease   Active   2d
sandbox/deploy/stack/compose# KUBECONFIG=./state/kube/kubeconfig.yaml kubectl get pods -A
NAMESPACE     NAME                      READY   STATUS        RESTARTS   AGE
kube-system   coredns-b96499967-jmppg   1/1     Terminating   0          2d
kube-system   coredns-b96499967-cgcqk   1/1     Running       0          2d
As a suggestion: it may be useful to advise people to set up kubectl (with a version supported by the deployed Kubernetes).
Regarding @gernotstarke's report: the output of step 4 of https://github.com/tinkerbell/sandbox/blob/main/docs/quickstarts/COMPOSE.md
is different on my side:
Creating network "compose_default" with the default driver
Creating compose_k3s_1 ... done
Creating compose_fetch-osie_1 ... done
Creating compose_fetch-and-convert-ubuntu-img_1 ... done
Creating compose_manifest-update_1 ... done
Creating compose_web-assets-server_1 ... done
Creating compose_rufio-crds-apply_1 ... done
Creating compose_tink-crds-apply_1 ... done
Creating compose_rufio_1 ... done
Creating compose_tink-server_1 ... done
Creating compose_tink-controller_1 ... done
Creating compose_hegel_1 ... done
Creating compose_manifest-apply_1 ... done
Creating compose_boots_1 ... done
Feel free to ask for more details if needed!
Thanks for your work anyway!
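For the missing tink-system namespace reported in this thread, a first debugging pass could look like the sketch below. It assumes it is run from sandbox/deploy/stack/compose, as in the prompts above, and that the compose service names match the container names in the "Creating ..." output (`tink-crds-apply`, `rufio-crds-apply`, `manifest-apply`). The function name `check_stack` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; run from sandbox/deploy/stack/compose.
check_stack() {
  # One-shot containers taken from the "Creating ..." list. Their logs
  # should show whether applying the CRDs and manifests succeeded.
  for svc in tink-crds-apply rufio-crds-apply manifest-apply; do
    echo "== logs: $svc =="
    docker-compose logs --tail=20 "$svc"
  done
  # List the installed CRDs, then search every namespace for workflows
  # instead of assuming that tink-system exists.
  KUBECONFIG=./state/kube/kubeconfig.yaml kubectl get crd
  KUBECONFIG=./state/kube/kubeconfig.yaml kubectl get workflow -A
}
```

If the workflow shows up in a namespace other than tink-system, the command quoted earlier in the thread is probably just pointing at the wrong namespace. If no Tinkerbell CRDs are listed at all, the logs of the apply containers are the next place to look.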