Not able to connect to Firecracker VM from the pod

Question

Closed this issue 7 months ago · 10 comments

I have a deployment file that is meant to boot a Firecracker VM inside a pod. I have installed Python and pip inside the ext4 filesystem. I want to execute a Python script inside the VM from the host or the pod (either through the CLI or from the YAML specification). The pod starts running and the curl requests to the Firecracker API succeed, but I am not able to connect to the VM.

Here is a sample deployment file I am using:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cnn-fc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cnn-fc
  template:
    metadata:
      labels:
        app: cnn-fc
    spec:    
      hostPID: true  # Required for accessing /dev/kvm
      containers:
        - name: cnn-fc
          image: localhost:5000/ubuntu22.04-updated:latest
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true  # Firecracker needs KVM access          
          command: ["/bin/bash", "-c"]
          args:
            - |
              echo "-------- Starting Firecracker VM --------"
              SOCKET_PATH="/run/firecracker-${POD_NAME}.sock"
              rm -f $SOCKET_PATH  # Ensure no stale socket
              /usr/local/bin/firecracker --api-sock $SOCKET_PATH > /var/lib/firecracker.log 2>&1 &
              FC_PID=$!
              echo "Firecracker started with PID: $FC_PID on socket: $SOCKET_PATH"              

              # echo "-------- Checking Firecracker process --------"
              # ps aux | grep firecracker

              # Configure Firecracker VM
              echo "------- Configuring Firecracker boot source --------"
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/boot-source" \
              -H "Content-Type: application/json" \
              -d '{
                      "kernel_image_path": "/var/lib/firecracker-containerd/runtime/hello-vmlinux.bin",
                      "boot_args": "console=ttyS0 reboot=k panic=1 pci=off selinux=0 quiet loglevel=0"
                  }' 

              

              # Attach root filesystem
              echo "-------- Attaching root filesystem --------"
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/drives/rootfs" \
              -H "Content-Type: application/json" \
              -d '{
                      "drive_id": "rootfs",
                      "path_on_host": "/var/lib/firecracker-containerd/runtime/ubuntu-24.04.ext4",
                      "is_root_device": true,
                      "is_read_only": true
                  }'             

              echo "Root filesystem attached successfully!"             

              # Start VM
              echo "-------- Starting Firecracker VM --------"
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/actions" \
              -H "Content-Type: application/json" \
              -d '{
                      "action_type": "InstanceStart"
                  }'
             
              echo "Firecracker VM started successfully!"

              tail -f /dev/null

          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name  # Unique socket per pod
          volumeMounts:
            - name: firecracker-socket
              mountPath: /run
            - name: firecracker-binary
              mountPath: /usr/local/bin/firecracker  
            - name: firecracker-images
              mountPath: /var/lib/firecracker-containerd/runtime  
            
      volumes:
        - name: firecracker-socket
          hostPath:
            path: /run
            type: Directory
        - name: firecracker-binary
          hostPath:
            path: /usr/local/bin/firecracker  # Firecracker binary on host
            type: File
        - name: firecracker-images
          hostPath:
            path: /var/lib/firecracker-containerd/runtime  # Kernel & RootFS images
            type: Directory

Please let me know what other information you need from my side. Please help.

Answer 1 · 2025-03-26T14:46:47.000Z

Hey @anubhavjana,

thanks for submitting this report. It looks like you're using Kubernetes in your setup to orchestrate Firecracker uVMs. We only provide support for Firecracker in this repository, and I'm personally not very familiar with Kubernetes. If you could reproduce the issue in an environment with Firecracker only, that would be ideal. Alternatively, please provide a detailed explanation of your complete setup.

In any case, I took a look and noticed you didn't set up any networking interface in Firecracker, nor a vsock. How is the communication between Kubernetes and the uVM supposed to work?

Furthermore, if you could provide some logs from your code, firecracker, and the uVM serial console, it would be helpful to understand what's happening.

Thanks,
Riccardo

Answer 2 · 2025-03-27T07:12:35.000Z

Hi @Manciukic, thanks for getting back. My main goal is to have a Firecracker pod up and running on Kubernetes and then execute a Python script inside it. For this, I have already installed pip3 and python3 in the .ext4 filesystem that I am using.

This is the portion of the Firecracker setup inside the YAML definition (these are the same steps I would use on a bare-metal host):

              echo "--------Starting Firecracker VM--------"
              TAP_DEV="tap0"
              TAP_IP="172.16.0.1"
              MASK_SHORT="/30"

              # Setup network interface
              ip link del "$TAP_DEV" 2> /dev/null || true
              ip tuntap add dev "$TAP_DEV" mode tap
              ip addr add "${TAP_IP}${MASK_SHORT}" dev "$TAP_DEV"
              ip link set dev "$TAP_DEV" up

              # Enable IP forwarding
              echo 1 > /proc/sys/net/ipv4/ip_forward
              iptables -P FORWARD ACCEPT

              # Identify the host's primary network interface
              HOST_IFACE=$(ip -j route list default | jq -r '.[0].dev')
              echo "----------- Host Interface: $HOST_IFACE --------------"

              # Flush and reapply NAT rules
              iptables -t nat -D POSTROUTING -o "$HOST_IFACE" -j MASQUERADE 2>/dev/null || true
              iptables -t nat -A POSTROUTING -o "$HOST_IFACE" -j MASQUERADE
              iptables -A FORWARD -i "$TAP_DEV" -o "$HOST_IFACE" -j ACCEPT
              iptables -A FORWARD -i "$HOST_IFACE" -o "$TAP_DEV" -m state --state RELATED,ESTABLISHED -j ACCEPT

              SOCKET_PATH="/run/firecracker-${POD_NAME}.sock"
              rm -f $SOCKET_PATH  # Ensure no stale socket
              /usr/local/bin/firecracker --api-sock $SOCKET_PATH > /var/lib/firecracker.log 2>&1 &

              FC_PID=$!
              echo "-------- Firecracker started with PID: $FC_PID on socket: $SOCKET_PATH --------"

              # Configure Firecracker VM
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/boot-source" \
                -H "Content-Type: application/json" \
                -d '{
                      "kernel_image_path": "/var/lib/firecracker-containerd/runtime/hello-vmlinux.bin",
                      "boot_args": "console=ttyS0 reboot=k panic=1 pci=off nomodules selinux=0 quiet loglevel=0"
                  }'

              echo "-------- Firecracker configured with kernel_image_path AND boot_args --------"

              # Attach root filesystem
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/drives/rootfs" \
                -H "Content-Type: application/json" \
                -d '{
                      "drive_id": "rootfs",
                      "path_on_host": "/var/lib/firecracker-containerd/runtime/ubuntu-24.04.ext4",
                      "is_root_device": true,
                      "is_read_only": false
                  }'
              echo "-------- Attached root filesystem successfully --------"

              # Attach writable volume for output
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/drives/output" \
                -H "Content-Type: application/json" \
                -d '{
                      "drive_id": "output",
                      "path_on_host": "/var/lib/firecracker-containerd/runtime/linpack-fc-output.txt",
                      "is_root_device": false,
                      "is_read_only": false
                  }'

              echo "-------- Attached writable volume for output successfully --------"
              FC_MAC="06:00:AC:10:00:02"

              # Set network interface
              curl --unix-socket $SOCKET_PATH \
              -X PUT 'http://localhost/network-interfaces/eth0' \
              -H 'Accept: application/json' \
              -H 'Content-Type: application/json' \
              -d '{
                  "iface_id": "eth0",
                  "guest_mac": "06:00:AC:10:00:02",
                  "host_dev_name": "tap0"
                }'

              

              sleep 0.1s
              # Start VM
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/actions" \
                -H "Content-Type: application/json" \
                -d '{
                      "action_type": "InstanceStart"
                  }'

              echo "--------Firecracker VM started!--------"
              sleep 2s  # Increase wait time for VM boot

On deploying it, here are the logs:

Here is the output of ps aux from host.

curl --unix-socket /run/firecracker-fc-test-5bb57cc476-cwqzm.sock -X GET "http://localhost/machine-config"

{"vcpu_count":1,"mem_size_mib":128,"smt":false,"track_dirty_pages":false}

Ran the following from the pod container where FC is supposed to boot up.

ip a show tap0
iptables -t nat -L -v

3: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 96:47:d9:d2:db:17 brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.1/30 scope global tap0
       valid_lft forever preferred_lft forever
    inet6 fe80::9447:d9ff:fed2:db17/64 scope link 
       valid_lft forever preferred_lft forever
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 MASQUERADE  all  --  any    eth0    anywhere             anywhere

I ran the following commands to set up the tap device on the host:

sudo ip tuntap add dev tap0 mode tap
sudo ip link set tap0 up
sudo ip addr add 192.168.100.1/24 dev tap0

Finally, this is the Firecracker boot log inside the pod container:

So, could you please look at these logs and help me with how I can run a Python script inside the microVM from outside the VM, say, from the host? @Manciukic

Answer 3 · 2025-03-31T06:04:19.000Z

Hi @Manciukic @ShadowCurse - can I get some support on this? This is somewhat of a priority for me.

Answer 4 · 2025-04-01T11:08:21.000Z

How are you configuring the networking inside of the guest?
When using our CI images, there's an fcnet.service (code) systemd service that sets it up based on the interface's physical (MAC) address. If you're not using that, the network configuration inside the guest may be missing.
If you have access to the serial console you can try to check if networking is configured by running ip a.

One simple configuration is to use the ip= kernel boot_args option. E.g. ip=172.16.0.2::172.16.0.1:255.255.255.252::eth0:off.

Also note that the code you're using to configure host networking seems to be taken from our getting started guide and that would only work for one VM per network namespace (which I think you have already as it's running inside a kubernetes container).
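As a rough sketch of the ip= option mentioned above: the kernel parses it as ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>, so the example leaves the server and hostname fields empty and turns autoconfiguration off. The helper name build_ip_arg and the shortened BASE_ARGS are illustrative only, not part of Firecracker, and this assumes the guest kernel was built with IP autoconfiguration support (CONFIG_IP_PNP).

```shell
# Build the kernel "ip=" argument from: guest IP, gateway, netmask, device.
# Hypothetical helper for illustration; field order follows the kernel's
# ip=<client>:<server>:<gateway>:<netmask>:<hostname>:<device>:<autoconf>.
build_ip_arg() {
  # Server and hostname fields stay empty; "off" disables DHCP/BOOTP autoconf.
  printf 'ip=%s::%s:%s::%s:off' "$1" "$2" "$3" "$4"
}

# Shortened boot args for readability; keep whatever else your setup needs.
BASE_ARGS="console=ttyS0 reboot=k panic=1 pci=off"
BOOT_ARGS="$BASE_ARGS $(build_ip_arg 172.16.0.2 172.16.0.1 255.255.255.252 eth0)"
echo "$BOOT_ARGS"
```

The resulting string would go into the "boot_args" field of the PUT /boot-source request, with 172.16.0.1 being the tap device address on the host side.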

Answer 5 · 2025-04-02T08:00:24.000Z

@Manciukic - Following is my configuration for FC:

              TAP_DEV="tap0"
              TAP_IP="172.16.0.1"
              MASK_SHORT="/30"

              # Setup network interface
              ip link del "$TAP_DEV" 2> /dev/null || true
              ip tuntap add dev "$TAP_DEV" mode tap
              ip addr add "${TAP_IP}${MASK_SHORT}" dev "$TAP_DEV"
              ip link set dev "$TAP_DEV" up

              # Enable IP forwarding
              echo 1 > /proc/sys/net/ipv4/ip_forward
              iptables -P FORWARD ACCEPT

              # Identify the host's primary network interface
              HOST_IFACE=$(ip -j route list default | jq -r '.[0].dev')
              echo "----------- Host Interface: $HOST_IFACE --------------"

              # Flush and reapply NAT rules
              iptables -t nat -D POSTROUTING -o "$HOST_IFACE" -j MASQUERADE 2>/dev/null || true
              iptables -t nat -A POSTROUTING -o "$HOST_IFACE" -j MASQUERADE
              iptables -A FORWARD -i "$TAP_DEV" -o "$HOST_IFACE" -j ACCEPT
              iptables -A FORWARD -i "$HOST_IFACE" -o "$TAP_DEV" -m state --state RELATED,ESTABLISHED -j ACCEPT
              iptables -A FORWARD -i "$TAP_DEV" -p tcp --dport 22 -j ACCEPT
              iptables -A FORWARD -o "$TAP_DEV" -p tcp --sport 22 -j ACCEPT

              SOCKET_PATH="/run/firecracker-${POD_NAME}.sock"
              rm -f $SOCKET_PATH  # Ensure no stale socket
              /usr/local/bin/firecracker --api-sock $SOCKET_PATH > /var/lib/firecracker.log 2>&1 &

              FC_PID=$!
              echo "-------- Firecracker started with PID: $FC_PID on socket: $SOCKET_PATH --------"

              # Configure Firecracker VM
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/boot-source" \
                -H "Content-Type: application/json" \
                -d '{
                      "kernel_image_path": "/var/lib/firecracker-containerd/runtime/hello-vmlinux.bin",
                      "boot_args": "console=ttyS0 reboot=k panic=1 pci=off nomodules selinux=0 quiet loglevel=0 systemd.mask=systemd-resolved.service systemd.mask=systemd-random-seed.service"
                  }'

              echo "-------- Firecracker configured with kernel_image_path AND boot_args --------"

              # Attach root filesystem
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/drives/rootfs" \
                -H "Content-Type: application/json" \
                -d '{
                      "drive_id": "rootfs",
                      "path_on_host": "/var/lib/firecracker-containerd/runtime/ubuntu-24.04.ext4",
                      "is_root_device": true,
                      "is_read_only": false
                  }'
              echo "-------- Attached root filesystem successfully --------"

              # Attach writable volume for output
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/drives/output" \
                -H "Content-Type: application/json" \
                -d '{
                      "drive_id": "output",
                      "path_on_host": "/var/lib/firecracker-containerd/runtime/linpack-fc-output.txt",
                      "is_root_device": false,
                      "is_read_only": false
                  }'

              echo "-------- Attached writable volume for output successfully --------"
              FC_MAC="06:00:AC:10:00:02"

              # Set network interface
              curl --unix-socket $SOCKET_PATH \
              -X PUT 'http://localhost/network-interfaces/eth0' \
              -H 'Accept: application/json' \
              -H 'Content-Type: application/json' \
              -d '{
                  "iface_id": "eth0",
                  "guest_mac": "06:00:AC:10:00:02",
                  "host_dev_name": "tap0"
                }'
             

              sleep 0.1s
              # Start VM
              curl --unix-socket $SOCKET_PATH -X PUT "http://localhost/actions" \
                -H "Content-Type: application/json" \
                -d '{
                      "action_type": "InstanceStart"
                  }'

On one occasion, I could ssh into the VM from my pod with the following command: ssh -o StrictHostKeyChecking=no -i /var/lib/firecracker-containerd/runtime/ubuntu-24.04.id_rsa root@172.16.0.2

On the pod, ip neigh show reports: 172.16.0.2 dev tap0 lladdr 06:00:ac:10:00:02 REACHABLE

I could even run a python script inside the VM:

But when I redeploy the YAML, ssh times out with the same command. The VM only gets its IP assigned after a long time. Why is that happening?
Also, since I already have the necessary scripts inside the filesystem as shown, is there any way I can execute the Python script from outside the VM?

Thanks. Looking forward to your response. @Manciukic

Answer 6 · 2025-04-02T08:41:26.000Z

This looks like a VM guest configuration issue rather than a Firecracker problem.

It's still not clear to me how the VM is getting its IP address assigned in your setup. You could check the serial console output from Firecracker to see when and how it's getting assigned.

Regarding running python, you need a way to communicate with the VM to start the executable and read the output. That could be ssh over a network connection, the serial console, a daemon running in the guest listening on a vsock or network port, etc. The best solution will depend on your particular use-case and requirements.
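A minimal sketch of the ssh option, assuming the guest is reachable at 172.16.0.2 and the key path is the one used earlier in this thread. The script path /root/hello.py, the function name run_guest_python, and the DRY_RUN switch are hypothetical and exist only for this illustration.

```shell
# Guest address and key as used earlier in the thread; override via environment.
GUEST_IP="${GUEST_IP:-172.16.0.2}"
SSH_KEY="${SSH_KEY:-/var/lib/firecracker-containerd/runtime/ubuntu-24.04.id_rsa}"

# Run a Python script that already exists inside the guest rootfs.
# With DRY_RUN=1 the command is printed instead of executed.
run_guest_python() {
  # Replace the function's arguments with the full ssh command line.
  set -- ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 \
    -i "$SSH_KEY" "root@${GUEST_IP}" python3 "$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"   # the script's stdout/stderr come back over the ssh session
  fi
}
```

Calling run_guest_python /root/hello.py from the pod would then return the script's output and exit status through ssh. ConnectTimeout keeps the call from hanging for long while the guest network is still coming up.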

Also, depending on your use case, there may be better ways to run a Firecracker VM inside a Kubernetes pod, such as Kata Containers or firecracker-containerd, to name a few.

Answer 7 · 2025-04-02T08:57:33.000Z

"It's still not clear to me how the VM is getting its IP address assigned in your setup. You could check the serial console output from Firecracker to see when and how it's getting assigned." -- It comes from the tap device. The tap0 in the pod (which acts as the host) is 172.16.0.1, and since /30 is used, the next address, 172.16.0.2, is assigned to the VM.

My particular use case is: the VM should be booted up and running, and a load generator (on a different machine) will keep sending requests to the VM to execute the script inside it.

Yes, regarding the Kata Containers approach - can you help me with pointers on how I can use Kata Containers to do this?

Answer 8 · 2025-04-02T09:00:08.000Z

Also @Manciukic - if I am using Kata Containers, how do I make sure the python3 and pip3 packages are installed and present in the VM? My current approach uses an Ubuntu 24.04 ext4 filesystem where I have installed all of these.

Answer 9 · 2025-04-02T09:42:15.000Z

The tap0 in the pod (which acts as the host) is 172.16.0.1, and since /30 is used, the next address, 172.16.0.2, is assigned to the VM.

How does the guest know it needs to use 172.16.0.2/30 and that 172.16.0.1 is the gateway? Is it using fcnet.service, a static IP, DHCP, or something else?
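For reference, the /30 arithmetic behind that question can be sketched as follows. A /30 holds four addresses, of which two are usable, so with tap0 at 172.16.0.1 the only address left for the guest is 172.16.0.2. The subnet alone does not configure it, though; something inside the guest still has to set it. The function name usable_hosts_30 is made up for this illustration.

```shell
# Print the two usable host addresses of the /30 that contains the given IPv4.
# The body runs in a subshell (parentheses) so the IFS change stays local.
usable_hosts_30() (
  IFS=.                 # split the dotted quad on dots
  set -- $1             # $1..$4 now hold the four octets
  base=$(( $4 & ~3 ))   # clear the two host bits to get the network octet
  echo "$1.$2.$3.$(( base + 1 )) $1.$2.$3.$(( base + 2 ))"
)

usable_hosts_30 172.16.0.1   # prints: 172.16.0.1 172.16.0.2
```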

can you help me with pointers on how I can use Kata Containers to do this?

You can look at their official guide https://github.com/kata-containers/kata-containers/blob/main/docs/how-to/how-to-use-kata-containers-with-firecracker.md
Or you may look at firecracker-containerd: https://github.com/firecracker-microvm/firecracker-containerd/blob/main/docs/getting-started.md
I have never used them myself, but they allow you to create an FC uVM the same way you would create a Docker container from a container image.

how do I make sure the python3 and pip3 packages are installed and present in the VM? My current approach uses an Ubuntu 24.04 ext4 filesystem where I have installed all of these.

They are both container runtimes and they will run the specified container image, like Docker.

Answer 10 · 2025-04-02T09:45:37.000Z

I'm converting this thread to a discussion since, as mentioned above, this is not an FC issue; it relates to using FC from within Kubernetes.