Azure/WALinuxAgent

[BUG] deprovision causes exception on RHEL 9.0 Beta

johanburati opened this issue · 14 comments

Working on an image for RHEL9, when I try to deprovision I get an exception:

# cat /etc/redhat-release
Red Hat Enterprise Linux release 9.0 Beta (Plow)

# waagent --version
WALinuxAgent-2.3.0.2 running on rhel 9.0
Python: 3.9.6
Goal state agent: 2.3.0.2

# /usr/sbin/waagent -force -deprovision
WARNING! The waagent service will be stopped.
WARNING! All SSH host key pairs will be deleted.
WARNING! Cached DHCP leases will be deleted.
WARNING! root password will be disabled. You will not be able to login as root.
WARNING! /etc/resolv.conf will be deleted.
2022-05-19T09:18:42.791639Z INFO MainThread Examine /proc/net/route for primary interface
2022-05-19T09:18:42.796461Z INFO MainThread Primary interface is [eth0]
2022-05-19T09:18:42.804226Z ERROR MainThread Failed to run 'deprovision': Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/azurelinuxagent/agent.py", line 247, in main
    agent.deprovision(force, deluser=False)
  File "/usr/lib/python3.9/site-packages/azurelinuxagent/agent.py", line 153, in deprovision
    deprovision_handler.run(force=force, deluser=deluser)
  File "/usr/lib/python3.9/site-packages/azurelinuxagent/pa/deprovision/default.py", line 221, in run
    self.do_actions(actions)
  File "/usr/lib/python3.9/site-packages/azurelinuxagent/pa/deprovision/default.py", line 241, in do_actions
    action.invoke()
  File "/usr/lib/python3.9/site-packages/azurelinuxagent/pa/deprovision/default.py", line 57, in invoke
    self.func(*self.args, **self.kwargs)
  File "/usr/lib/python3.9/site-packages/azurelinuxagent/common/osutil/redhat.py", line 89, in set_dhcp_hostname
    fileutil.update_conf_file(filepath,
  File "/usr/lib/python3.9/site-packages/azurelinuxagent/common/utils/fileutil.py", line 162, in update_conf_file
    conf = read_file(path).split('\n')
  File "/usr/lib/python3.9/site-packages/azurelinuxagent/common/utils/fileutil.py", line 53, in read_file
    with open(filepath, mode) as in_file:
FileNotFoundError: [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-eth0'

Looks like the network configuration is stored in /etc/NetworkManager/system-connections/eth0.nmconnection in this release.
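As a minimal sketch of that observation (not WALinuxAgent code), a lookup could check both locations before assuming the ifcfg file exists. The helper name `find_interface_config` and its `root` parameter are hypothetical, and the assumption that the keyfile is named after the interface comes only from the path seen on this system:

```python
import os


def find_interface_config(ifname, root="/"):
    """Return (kind, path) for the first existing config file of ifname, else None.

    Hypothetical helper for illustration only; not part of WALinuxAgent.
    """
    candidates = [
        # Legacy location that redhat.py expects in the traceback above.
        ("ifcfg", os.path.join(root, "etc/sysconfig/network-scripts", "ifcfg-" + ifname)),
        # NetworkManager keyfile location observed on this RHEL 9 system.
        # Assumption: the keyfile is named after the interface.
        ("keyfile", os.path.join(root, "etc/NetworkManager/system-connections", ifname + ".nmconnection")),
    ]
    for kind, path in candidates:
        if os.path.isfile(path):
            return kind, path
    return None
```

A caller could then skip or adapt the DHCP hostname update when the result is `None` or `"keyfile"` instead of failing with `FileNotFoundError`.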

Just in case, I tried the latest release, v2.7.0.6, but hit the same issue:

#  /usr/sbin/waagent --version
WALinuxAgent-2.7.0.6 running on rhel 9.0
Python: 3.9.6
Goal state agent: 2.7.0.6

# /usr/sbin/waagent -force -deprovision
WARNING! The waagent service will be stopped.
WARNING! All SSH host key pairs will be deleted.
WARNING! Cached DHCP leases will be deleted.
WARNING! root password will be disabled. You will not be able to login as root.
WARNING! /etc/resolv.conf will be deleted.
2022-05-20T03:05:08.474332Z INFO MainThread Examine /proc/net/route for primary interface
2022-05-20T03:05:08.480664Z INFO MainThread Primary interface is [eth0]
2022-05-20T03:05:08.487732Z ERROR MainThread Failed to run 'deprovision': [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-eth0'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/agent.py", line 265, in main
    agent.deprovision(force, deluser=False)
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/agent.py", line 155, in deprovision
    deprovision_handler.run(force=force, deluser=deluser)
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/pa/deprovision/default.py", line 221, in run
    self.do_actions(actions)
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/pa/deprovision/default.py", line 241, in do_actions
    action.invoke()
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/pa/deprovision/default.py", line 57, in invoke
    self.func(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/common/osutil/redhat.py", line 89, in set_dhcp_hostname
    fileutil.update_conf_file(filepath,
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/common/utils/fileutil.py", line 162, in update_conf_file
    conf = read_file(path).split('\n')
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/common/utils/fileutil.py", line 53, in read_file
    with open(filepath, mode) as in_file:
FileNotFoundError: [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-eth0'

I also tried the develop branch, since you've posted a fix (#2592):

# /usr/sbin/waagent --version
WALinuxAgent-9.9.9.9 running on rhel 9.0
Python: 3.9.6
Goal state agent: 9.9.9.9

but still have the issue:

# /usr/sbin/waagent -force -deprovision
WARNING! The waagent service will be stopped.
WARNING! All SSH host key pairs will be deleted.
WARNING! Cached DHCP leases will be deleted.
WARNING! root password will be disabled. You will not be able to login as root.
WARNING! /etc/resolv.conf will be deleted.
2022-05-20T07:00:21.333006Z INFO MainThread Examine /proc/net/route for primary interface
2022-05-20T07:00:21.341866Z INFO MainThread Primary interface is [eth0]
2022-05-20T07:00:21.512733Z ERROR MainThread Failed to run 'deprovision': [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-eth0'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/agent.py", line 265, in main
    agent.deprovision(force, deluser=False)
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/agent.py", line 155, in deprovision
    deprovision_handler.run(force=force, deluser=deluser)
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/pa/deprovision/default.py", line 221, in run
    self.do_actions(actions)
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/pa/deprovision/default.py", line 241, in do_actions
    action.invoke()
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/pa/deprovision/default.py", line 57, in invoke
    self.func(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/common/osutil/redhat.py", line 89, in set_dhcp_hostname
    fileutil.update_conf_file(filepath,
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/common/utils/fileutil.py", line 162, in update_conf_file
    conf = read_file(path).split('\n')
  File "/usr/local/lib/python3.9/site-packages/azurelinuxagent/common/utils/fileutil.py", line 53, in read_file
    with open(filepath, mode) as in_file:
FileNotFoundError: [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-eth0'

@johanburati I'm unable to repro this. I tested with cloud-init as the provisioning agent and it works fine; the file is present. Its content is:

# cat /etc/sysconfig/network-scripts/ifcfg-eth0 
# Created by cloud-init on instance boot automatically, do not edit.
#
AUTOCONNECT_PRIORITY=999
BOOTPROTO=dhcp
DEVICE=eth0
HWADDR=60:45:bd:86:59:f9
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
DHCP_HOSTNAME=localhost.localdomain

When do you see this exception? Are you validating the waagent provision/deprovision path? Can you share repro steps?

> When do you see this exception? Are you validating the waagent provision/deprovision path? Can you share repro steps?

I built the image on my workstation using the DVD from Red Hat, uploaded the VHD to Azure, and created a VM by attaching the VHD. From there I was planning to run waagent -deprovision and then create an image.

I was planning to use waagent for provisioning so I did not install cloud-init.

If I install the cloud-init package it works, because cloud-init creates the file in network-scripts, but shouldn't waagent handle that situation without depending on cloud-init?

The marketplace images seem to be working. We don't know what custom image you are using, and I couldn't repro the issue. What type of image is this "DVD from Red Hat"? Is it the marketplace version of the image or something else?

> What type of image is this "DVD from Red Hat"? Is it the marketplace version of the image or something else?

It means you download the DVD ISO for RHEL9 from the Red Hat website: https://developers.redhat.com/products/rhel/download

> The marketplace images seem to be working. We don't know what custom image you are using.

I downloaded the DVD from Red Hat as described above and installed it on a VM on my PC, then uploaded the VHD to Azure.
The only difference from the marketplace images is that I don't install the cloud-init package.

If cloud-init is installed, it creates the files under the network-scripts folder and WALinuxAgent does not throw an exception.
If cloud-init is not installed, the files under network-scripts are not there and WALinuxAgent throws an exception.
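To illustrate the kind of handling being asked for, here is a minimal sketch of a config update that tolerates a missing file. It is not the agent's `fileutil.update_conf_file`; the name `update_conf_file_if_present` and its return value are hypothetical, and only the read-then-split-on-newline step is taken from the traceback:

```python
import os


def update_conf_file_if_present(path, line_start, val):
    """Replace lines starting with line_start by val.

    Returns False and does nothing if path does not exist, instead of raising
    FileNotFoundError. Hypothetical helper; not part of WALinuxAgent.
    """
    if not os.path.isfile(path):
        return False
    with open(path, "r") as in_file:
        lines = in_file.read().split("\n")
    # Drop empty lines and any existing setting that val replaces.
    kept = [line for line in lines if line and not line.startswith(line_start)]
    kept.append(val)
    with open(path, "w") as out_file:
        out_file.write("\n".join(kept) + "\n")
    return True
```

With a guard like this, a deprovision step that sets `DHCP_HOSTNAME` could log a warning and continue when `ifcfg-eth0` is absent. Whether skipping is the right behavior for the agent is a design decision for the maintainers.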

@narrieta IMHO WALinuxAgent should be able to handle the provisioning by itself without requiring cloud-init to be installed.

@johanburati The main purpose of the "deprovision" option in waagent is to help prepare a generalized image from an image in the marketplace where the agent has already been installed. The images in the marketplace use cloud-init and that case seems to be handled correctly.

If you can provide detailed instructions to reproduce the problem you are seeing, we may be able to help.

> If you can provide detailed instructions to reproduce the problem you are seeing, we may be able to help.

The instructions are the ones you can find in:

That doc specifies that one needs to deprovision the VM:

  1. Deprovision
    Run the following commands to deprovision the virtual machine and prepare it for provisioning on Azure:

When you do, waagent throws an exception:

FileNotFoundError: [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-eth0'

Is it expected that waagent throws an exception if it does not find the /etc/sysconfig/network-scripts/ifcfg-eth0 file?

My point is that network-scripts are not used on RHEL9; the Marketplace image does not throw the error because it has the cloud-init package installed, and cloud-init generates that file.

Shouldn't waagent be able to handle deprovisioning/provisioning of the VM without requiring cloud-init?
By the way, if that is not the case, then the WALinuxAgent RPM should have the cloud-init package as a requirement.

Hi @narrieta ,

I also see this issue on RHEL-9.0. It happens when there is no /etc/sysconfig/network-scripts/ifcfg-eth0 in the image. The ifcfg-eth0 file is no longer required in either RHEL-9 or RHEL-8, and the network-script . I think a missing file should not block WALA provisioning.
Besides, if there is no ifcfg-eth0 file, the waagent -deprovision command also throws an exception:
2022-05-31T08:12:29.760610Z ERROR MainThread Failed to run 'deprovision': [Errno 2] No such file or directory: '/etc/sysconfig/network-scripts/ifcfg-eth0'
So even using waagent -deprovision to prepare the generalized image cannot resolve this issue.

Thanks!

@johanburati cloud-init is the recommended/supported way to provision RHEL9. We provide limited support for other scenarios. I'll leave this issue open, but it may take us some time to get to it.

We don't produce the RPM package. If Red Hat's package is missing the dependency on cloud-init, this should probably be reported to them.

@yuxisun1217 Are you provisioning with cloud-init?

@narrieta No. I hit this issue when testing the WALA feature on RHEL (without cloud-init). I know that the RHEL images in the Azure Marketplace all use cloud-init as the provisioning agent, but I think customers with customized images may hit this issue.