hashicorp/vagrant-vmware-desktop

vagrant not responding address of VM

Yaminyam opened this issue · 6 comments

Debug output

Vagrant waits indefinitely at 'Waiting for the VM to receive an address...', even though the VM is already running.
If I terminate vagrant with Ctrl+C and run it again, it sees that the node is already running and proceeds to the next step.
The Vagrantfile below defines a total of 3 nodes, so I need to run vagrant 4 times before it exits normally.

Expected behavior

vagrant should proceed to the next step.

Actual behavior

Endless waiting at 'Waiting for the VM to receive an address...'.

Reproduction information

Vagrant version

2.3.4

Host operating system

macOS Ventura 13.0

Guest operating system

Ubuntu 22.04

Steps to reproduce

Vagrantfile

NUM_WORKER_NODES=2
IP_NW="10.0.0."
IP_START=10

Vagrant.configure("2") do |config|
    config.vm.provision "shell", inline: <<-SHELL
        apt-get update -y
        echo "$IP_NW$((IP_START))  master-node" >> /etc/hosts
        echo "$IP_NW$((IP_START+1))  worker-node01" >> /etc/hosts
        echo "$IP_NW$((IP_START+2))  worker-node02" >> /etc/hosts
    SHELL
    config.vm.box = "martyt/ubuntu2204server-arm"
    config.vm.box_check_update = true
    config.vm.provider :vmware_desktop do |vmware|
      vmware.allowlist_verified = true
      vmware.vmx["ethernet0.pcislotnumber"] = "160"
    end
    config.vm.define "master" do |master|
      master.vm.hostname = "master-node"
      master.vm.network "private_network", ip: IP_NW + "#{IP_START}"
      master.vm.provider "vmware_desktop" do |vb|
          vb.gui = true
          vb.memory = 4048
          vb.cpus = 2
          vb.vmx["ethernet0.pcislotnumber"] = "160"
      end
      master.vm.provision "shell", path: "scripts/common.sh"
      master.vm.provision "shell", path: "scripts/master.sh"
    end

    (1..NUM_WORKER_NODES).each do |i|
      config.vm.define "node0#{i}" do |node|
        node.vm.hostname = "worker-node0#{i}"
        node.vm.network "private_network", ip: IP_NW + "#{IP_START + i}"
        node.vm.provider "vmware_desktop" do |vb|
            vb.gui = true
            vb.memory = 2048
            vb.cpus = 1
            vb.vmx["ethernet0.pcislotnumber"] = "160"
        end
        node.vm.provision "shell", path: "scripts/common.sh"
        node.vm.provision "shell", path: "scripts/node.sh"
      end
    end
end
gkb commented

I might have the same exact issue.

The debug logs for Vagrant show an endless run of the following operation.

vmrun getGuestIPAddress

I still haven't figured out why a perfectly functional virtual machine abruptly started misbehaving. Perhaps it's a change in VMware Fusion or a macOS update; I happen to receive beta updates of macOS Ventura.

Here's the full dump of the debug log.
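
To check the lookup outside Vagrant, the same vmrun command can be polled with a deadline instead of looping forever. This is only a sketch under assumptions, not the plugin's implementation: wait_for_guest_ip and vmrun_guest_ip are hypothetical helper names, vmrun is assumed to be on the PATH, and the .vmx path in the usage line is a placeholder.

```ruby
require "open3"

# Hypothetical helper (not part of Vagrant or the plugin): calls the given
# block until it yields something that looks like an IPv4 address, or
# returns nil once the timeout (in seconds) has elapsed.
def wait_for_guest_ip(timeout: 120, interval: 2, &lookup)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    ip = lookup.call.to_s.strip
    return ip if ip.match?(/\A\d{1,3}(\.\d{1,3}){3}\z/)
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Assumption: `vmrun` is on the PATH. Returns the raw combined output;
# anything that is not an address is filtered out by the caller above.
def vmrun_guest_ip(vmx_path)
  out, _status = Open3.capture2e("vmrun", "getGuestIPAddress", vmx_path)
  out
end
```

Usage would look like `wait_for_guest_ip(timeout: 60) { vmrun_guest_ip("/path/to/machine.vmx") }`; a nil result after the timeout suggests the guest never reported an address, which matches the endless wait described above.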

gkb commented

It looks like the machine fails to boot at all.
It's stuck at 'EFI stub: Exiting boot services and installing virtual address map...'. So this may not be an issue with vagrant-vmware-desktop but with the virtualization itself and its interaction with the kernel. This thread on VMware Fusion discussions has more detail.

Update: I continue to get the same error with a 6.x kernel, which fixed the boot problem referred to above. So although a machine that doesn't boot could also cause this symptom, fixing the boot problem still leaves me with the IP lookup failure for which I included the debug logs.

gkb commented

The temporary workaround is to set ethernet0.virtualdev to vmxnet3 explicitly. This gist led me to it.
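
A minimal sketch of that workaround as a Vagrantfile fragment, reusing the vmx hash that the Vagrantfile above already sets. It assumes the setting is applied once at the provider level for every machine; the per-machine provider blocks should accept the same line.

```ruby
Vagrant.configure("2") do |config|
  config.vm.provider :vmware_desktop do |vmware|
    # Explicitly select the vmxnet3 virtual NIC rather than whatever
    # device the VM would otherwise be created with.
    vmware.vmx["ethernet0.virtualdev"] = "vmxnet3"
  end
end
```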

Hi there,

I believe this issue may have been resolved by #60, and the fix is available in the latest release. If you upgrade to the latest vagrant-vmware-desktop plugin, do you still experience the same behavior?

gkb commented

I added a comment on the commit. On my machine, the latest version of the plugin didn't seem to work.

Update: I'm trying to get the debug logs with the latest vagrant version. But there's an opaque error with vmrun that seems to stall vagrant. Here's the output from vagrant --debug up.

Invoking vmrun on its own with the path to the vmx file that vagrant created appears to work fine. What's puzzling about VMware Fusion is that, besides the sparse error message, there's no vmware.log file either.

gkb commented

Additionally, from my limited testing, e1000e doesn't appear to work as the value for ethernet0.virtualdev, while vmxnet3 does.