Deprecation of `dhclient` breaks cloud-init's network-config.
soupglasses opened this issue · 11 comments
Describe the bug
When running with `services.cloud-init.enable = true` and setting `network: renderers: [networkd]` inside `services.cloud-init.config` to enable networkd-based auto-configuration from cloud-init, cloud-init crashes in `cloud-init-local.service` because of its missing dependency on `dhclient`.
This can be seen when attempting to provision a new server on Hetzner with its HTTP-based metadata service, `http://169.254.169.254/hetzner/v1/metadata`.
The crash comes from `dhcp.py` in cloud-init, which expects `dhclient` to exist:
https://github.com/canonical/cloud-init/blob/b3978cbd6c68c883f5ab02630d8d7fcb220b270c/cloudinit/net/dhcp.py#L69
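
For illustration, the failing check behaves roughly like the sketch below. This is not the upstream code: `require_dhclient` and `MissingDhclientError` are hypothetical names, and the sketch assumes only that the client is looked up by name on `PATH`, as the "No dhclient command found" log line suggests.

```python
import shutil


class MissingDhclientError(Exception):
    """Hypothetical stand-in for cloud-init's NoDHCPLeaseMissingDhclientError."""


def require_dhclient(search_path=None):
    """Return the full path to dhclient, or raise if it cannot be found.

    search_path is a PATH-style string; None means the process's own PATH.
    """
    found = shutil.which("dhclient", path=search_path)
    if found is None:
        # Mirrors the "No dhclient command found" message in the crash log.
        raise MissingDhclientError("No dhclient command found")
    return found
```

Under this reading, anything that puts a `dhclient` binary on the `PATH` seen by cloud-init makes the lookup succeed, which is what the overlay workaround further down does.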
Crashlog
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,730 - util.py[DEBUG]: Reading from /sys/class/net/ens3/device/features (quiet=False)
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,730 - util.py[DEBUG]: Read 65 bytes from /sys/class/net/ens3/device/features
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,730 - util.py[DEBUG]: Reading from /sys/class/net/ens3/address (quiet=False)
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,730 - util.py[DEBUG]: Read 18 bytes from /sys/class/net/ens3/address
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,730 - util.py[DEBUG]: Reading from /sys/class/net/ens3/device/device (quiet=False)
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,719 - DataSourceHetzner.py[DEBUG]: Running on Hetzner Cloud: serial=28683344
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,719 - util.py[DEBUG]: Reading from /sys/class/net/ens3/name_assign_type (quiet=False)
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,719 - util.py[DEBUG]: Read 2 bytes from /sys/class/net/ens3/name_assign_type
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,719 - util.py[DEBUG]: Reading from /sys/class/net/ens3/address (quiet=False)
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,719 - util.py[DEBUG]: Read 18 bytes from /sys/class/net/ens3/address
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,720 - util.py[DEBUG]: Reading from /sys/class/net/ens3/carrier (quiet=False)
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,720 - __init__.py[DEBUG]: Interface has no carrier: ens3
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,720 - util.py[DEBUG]: Reading from /sys/class/net/ens3/dormant (quiet=False)
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,720 - util.py[DEBUG]: Reading from /sys/class/net/ens3/operstate (quiet=False)
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,720 - util.py[DEBUG]: Read 5 bytes from /sys/class/net/ens3/operstate
Feb 09 19:37:56 localhost systemd[1]: Finished Initial cloud-init job (pre-networking).
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,720 - util.py[DEBUG]: Reading from /proc/683/mountinfo (quiet=False)
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,720 - util.py[DEBUG]: Read 1833 bytes from /proc/683/mountinfo
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,721 - url_helper.py[DEBUG]: [0/1] open 'http://169.254.169.254/hetzner/v1/metadata/instance-id' with {'url': 'http://169.254.169.254/hetzner/v1/metadata/instance-id', 'stream': False, 'allow_redirects': True, 'method': 'GE>
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,722 - connectionpool.py[DEBUG]: Starting new HTTP connection (1): 169.254.169.254:80
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,724 - dhcp.py[DEBUG]: Skip dhclient configuration: No dhclient command found.
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,724 - DataSourceHetzner.py[ERROR]: Bailing, DHCP Exception:
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,724 - handlers.py[DEBUG]: finish: init-local/search-Hetzner: FAIL: no local data found from DataSourceHetzner
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,724 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceHetzner.DataSourceHetzner'> failed
Feb 09 19:37:56 localhost cloud-init[683]: 2023-02-09 19:37:56,724 - util.py[DEBUG]: Getting data from <class 'cloudinit.sources.DataSourceHetzner.DataSourceHetzner'> failed
Feb 09 19:37:56 localhost cloud-init[683]: Traceback (most recent call last):
Feb 09 19:37:56 localhost cloud-init[683]: File "/nix/store/l1mw1vick89idmz2dw0qxnva12zrdr7x-cloud-init-22.4/lib/python3.10/site-packages/cloudinit/sources/__init__.py", line 946, in find_source
Feb 09 19:37:56 localhost cloud-init[683]: if s.update_metadata_if_supported(
Feb 09 19:37:56 localhost cloud-init[683]: File "/nix/store/l1mw1vick89idmz2dw0qxnva12zrdr7x-cloud-init-22.4/lib/python3.10/site-packages/cloudinit/sources/__init__.py", line 828, in update_metadata_if_supported
Feb 09 19:37:56 localhost cloud-init[683]: result = self.get_data()
Feb 09 19:37:56 localhost cloud-init[683]: File "/nix/store/l1mw1vick89idmz2dw0qxnva12zrdr7x-cloud-init-22.4/lib/python3.10/site-packages/cloudinit/sources/__init__.py", line 373, in get_data
Feb 09 19:37:56 localhost cloud-init[683]: return_value = self._get_data()
Feb 09 19:37:56 localhost cloud-init[683]: File "/nix/store/l1mw1vick89idmz2dw0qxnva12zrdr7x-cloud-init-22.4/lib/python3.10/site-packages/cloudinit/sources/DataSourceHetzner.py", line 59, in _get_data
Feb 09 19:37:56 localhost cloud-init[683]: with EphemeralDHCPv4(
Feb 09 19:37:57 localhost cloud-init[683]: File "/nix/store/l1mw1vick89idmz2dw0qxnva12zrdr7x-cloud-init-22.4/lib/python3.10/site-packages/cloudinit/net/ephemeral.py", line 338, in __enter__
Feb 09 19:37:57 localhost cloud-init[683]: return self.obtain_lease()
Feb 09 19:37:57 localhost cloud-init[683]: File "/nix/store/l1mw1vick89idmz2dw0qxnva12zrdr7x-cloud-init-22.4/lib/python3.10/site-packages/cloudinit/net/ephemeral.py", line 363, in obtain_lease
Feb 09 19:37:57 localhost cloud-init[683]: leases = maybe_perform_dhcp_discovery(
Feb 09 19:37:57 localhost cloud-init[683]: File "/nix/store/l1mw1vick89idmz2dw0qxnva12zrdr7x-cloud-init-22.4/lib/python3.10/site-packages/cloudinit/net/dhcp.py", line 72, in maybe_perform_dhcp_discovery
Feb 09 19:37:57 localhost cloud-init[683]: raise NoDHCPLeaseMissingDhclientError()
Feb 09 19:37:57 localhost cloud-init[683]: cloudinit.net.dhcp.NoDHCPLeaseMissingDhclientError
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,724 - main.py[DEBUG]: No local datasource found
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,725 - util.py[DEBUG]: Reading from /sys/class/net/lo/address (quiet=False)
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,726 - util.py[DEBUG]: Read 18 bytes from /sys/class/net/lo/address
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,726 - util.py[DEBUG]: Reading from /sys/class/net/ens3/address (quiet=False)
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,726 - util.py[DEBUG]: Read 18 bytes from /sys/class/net/ens3/address
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,726 - util.py[DEBUG]: Reading from /sys/class/net/ens3/name_assign_type (quiet=False)
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,726 - util.py[DEBUG]: Read 2 bytes from /sys/class/net/ens3/name_assign_type
Feb 09 19:37:57 localhost cloud-init[683]: 2023-02-09 19:37:56,726 - util.py[DEBUG]: Reading from /sys/class/net/ens3/address (quiet=False)
Steps To Reproduce
Steps to reproduce the behavior:
- Build the minimal configuration (see additional context for this)
- Deploy it to Hetzner (I personally used nixos-anywhere for this)
- Reboot.
Expected behavior
Cloud-init can successfully generate IPv4 and IPv6 output for networkd.
Additional context
Full example of configuration that led to this error
{
  modulesPath,
  lib,
  ...
}: {
  imports = [
    "${modulesPath}/profiles/qemu-guest.nix"
  ];
  config = {
    # ... drive & boot configuration ...
    networking.useDHCP = false;
    networking.useNetworkd = true;
    networking.usePredictableInterfaceNames = lib.mkForce false;
    services.cloud-init.enable = true;
    services.cloud-init.network.enable = true;
    networking.hostName = lib.mkDefault "";
    # Attempt to limit down the root-cause. May/may-not be needed.
    systemd.services.cloud-init.enable = false;
    systemd.targets.cloud-config.requires = lib.mkForce ["cloud-init-local.service"];
    services.cloud-init.config = ''
      system_info:
        distro: nixos
        network:
          renderers: [networkd]
      users:
        - root
      disable_root: false
      preserve_hostname: false
      cloud_init_modules:
        - set_hostname
        - update_hostname
        - update_etc_hosts
      cloud_config_modules: []
      cloud_final_modules: []
      datasource_list:
        - DataSourceHetzner
    '';
  };
}
Workaround with overlay to get this functionality working
(_final: prev: {
  cloud-init = prev.cloud-init.overrideAttrs (_prev: {
    makeWrapperArgs = [
      "--prefix PATH : ${
        prev.lib.makeBinPath [
          prev.dmidecode
          prev.cloud-utils.guest
          (prev.dhcp.override { withClient = true; })
        ]
      }"
    ];
  });
})
Notify maintainers
@mweinelt, from the git blame of the breaking change in `dhcp/default.nix`.
@phile314 @illustris, package maintainers from `cloud-init/default.nix`.
Metadata
Please run `nix-shell -p nix-info --run "nix-info -m"` and paste the result.
- system: `"x86_64-linux"`
- host os: `Linux 5.15.90, NixOS, 22.11 (Raccoon), 22.11.20230209.dirty`
- multi-user?: `no`
- sandbox: `yes`
- version: `nix-env (Nix) 2.11.1`
- nixpkgs: `not found`
Please open an issue upstream, the ISC dhcp client (dhclient) has reached its end of life. The server component as well, and we ought to remove both of them some time soon.
> Please open an issue upstream
I was scared of this, but it is the correct thing to do. However, I did want to bring it up here first, as Canonical is not exactly known for removing end-of-life dependencies in a reasonable amount of time.
This may be broken for a while if we were to wait on upstream.
Great, they just added a notice about the end of life into the package description. That will accomplish nothing.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1023595
https://salsa.debian.org/debian/isc-dhcp/-/merge_requests/5
I agree that opening an issue with Canonical on Launchpad is a fool's errand. And cloud-init seems to have quite a bit of code around dhclient, which would need to be rewritten for e.g. dhcpcd.
Opened an issue, keeping hopes up.
https://bugs.launchpad.net/cloud-init/+bug/2006784
It's better than doing nothing, and it lets search engines index the error so that it is easier to find when this inevitably starts to affect more and more people.
I am honestly considering building a mini-script which could generate the same network setup as cloud-init does, but without using `dhclient`.
This could let you set `network: disabled` in cloud-init, and still have networking set up by the mini-script. But this is not a good solution to the core issue.
Did you take a look at Fedora and other distros? It's often helpful to see if they have patches we can reuse or what direction they are taking. I haven't looked at the problem in detail but it seems like the simplest solution here would be to patch the python code to use another DHCP client instead. It would be good if they could use networkd directly.
From the opened issue: they have an ongoing refactoring to replace their net module altogether. Following what was said there, its goal is to:
> shift the duty of knowing what is the correct dhcp client onto the distro classes
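
One way to read that goal: each distro class declares which DHCP clients it prefers, and the generic code picks the first one that is installed. The sketch below is only an illustration under that assumption. The class names, the `dhcp_client_priority` attribute, and `select_dhcp_client` are hypothetical and not cloud-init's API; `dhcpcd` and `udhcpc` are the alternatives mentioned in this thread.

```python
import shutil


class Distro:
    """Hypothetical base class: the default still prefers dhclient."""

    dhcp_client_priority = ["dhclient"]


class NixOS(Distro):
    """Hypothetical NixOS class that tries other clients before dhclient."""

    dhcp_client_priority = ["dhcpcd", "udhcpc", "dhclient"]


def select_dhcp_client(distro, search_path=None):
    """Return (name, full_path) of the first preferred client found, else None.

    search_path is a PATH-style string; None means the process's own PATH.
    """
    for name in distro.dhcp_client_priority:
        full_path = shutil.which(name, path=search_path)
        if full_path is not None:
            return name, full_path
    return None
```

With a layout like this, a distro without `dhclient` would only need to change its own priority list rather than patch the generic DHCP code.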
The issue did get closed as invalid. My guess is that I named NixOS as the distro, which made it an auto-close, even though it was said to be a valid issue.
I am not quite sure what to make of this: it was first said to be something to solve at the distro level with the refactor, then said to be a valid problem, and then closed as invalid.
> Did you take a look at Fedora and other distros?
Yeah, Fedora runs their own cloud-init, which they call Ignition. They do also package cloud-init, but it depends on dhclient for its DHCP step, and Fedora seems to maintain their own patchset for dhclient.
I also know that Clear Linux runs their own version too, called micro-config.
I also found this meta issue that tracks the deprecation of isc-dhcp-client across the board: https://bugs.launchpad.net/ubuntu/+source/maas/+bug/1717983. From there, I found this patchset that seems to replace dhclient with networkd in cloud-init: https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/331664 (edit: just noticed 2017 in the patch date)
I am currently working on a patch that allows cloud-init to utilize udhcpc instead of dhclient. Udhcpc, which is coming from busybox and is already employed by initrd-network, offers a potential alternative solution. However, I am not sure this kind of patch would be accepted upstream.
> I am currently working on a patch that allows cloud-init to utilize udhcpc instead of dhclient. Udhcpc, which is coming from busybox and is already employed by initrd-network, offers a potential alternative solution. However, I am not sure this kind of patch would be accepted upstream.
If by upstream, you meant NixOS, I am all for it as this is a nasty problem.
If by upstream, you meant cloud-init, I have no opinion :D.