linux-nvme/nvme-stas

Use auto-generated hostnqn as a fallback

tbzatek opened this issue · 7 comments

We're facing a distribution-packaging-specific issue where we cannot provide unique /etc/nvme/hostnqn or /etc/nvme/hostid files for various reasons (e.g. a generic pre-built rootfs image). This is typically not a problem for nvme-cli and libnvme-based tools, as a stable hostnqn is autogenerated as a fallback. There is no such fallback for hostid, which is often missing, but that has not really been a problem either.

However, nvme-stas requires those files to exist unless hostnqn or hostid are specified in sys.conf.

nvme-stas/staslib/stas.py

Lines 351 to 360 in 45c1985

def hostnqn(self):
    '''@brief return the host NQN
    @return: Host NQN
    @raise: Host NQN is mandatory. The program will terminate if a
            Host NQN cannot be determined.
    '''
    try:
        value = self.__get_value('Host', 'nqn', '/etc/nvme/hostnqn')
    except FileNotFoundError as ex:
        sys.exit('Error reading mandatory Host NQN (see stasadm --help): %s', ex)
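
To illustrate the kind of fallback being requested, here is a minimal sketch, not the nvme-stas or libnvme implementation. It assumes /etc/machine-id is an acceptable stable source and derives a UUID from it instead of exposing it directly. The names `hostnqn_with_fallback`, `read_first_line`, and `APP_NAMESPACE` are hypothetical; only the `nqn.2014-08.org.nvmexpress:uuid:` prefix comes from the NVMe specification.

```python
import uuid
from pathlib import Path

# NQN prefix for UUID-based names, as defined by the NVMe specification.
NQN_UUID_PREFIX = 'nqn.2014-08.org.nvmexpress:uuid:'

# Hypothetical label, used only to derive a UUID from the machine ID
# without putting the machine ID itself on the fabric.
APP_NAMESPACE = 'nvme-hostnqn-fallback'


def read_first_line(path):
    '''Return the first non-blank line of a file, or None if unavailable.'''
    try:
        for line in Path(path).read_text().splitlines():
            if line.strip():
                return line.strip()
    except (OSError, UnicodeDecodeError):
        pass
    return None


def hostnqn_with_fallback(hostnqn_file='/etc/nvme/hostnqn',
                          machine_id_file='/etc/machine-id'):
    '''Return the configured Host NQN, else one derived from the machine ID.

    Returns None when neither source is usable, so the caller decides
    whether that is fatal.
    '''
    configured = read_first_line(hostnqn_file)
    if configured:
        return configured

    machine_id = read_first_line(machine_id_file)
    if machine_id is None:
        return None
    try:
        namespace = uuid.UUID(hex=machine_id)
    except ValueError:
        return None
    # uuid5 is deterministic: the same machine ID gives the same NQN
    # on every boot.
    return NQN_UUID_PREFIX + str(uuid.uuid5(namespace, APP_NAMESPACE))
```

The result is only as unique as the machine ID, so an image that ships a fixed /etc/machine-id would still produce the same NQN on every host.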

You're basically asking for post-installation configuration to be executed at run time. Sorry to be blunt, but that's the kind of stuff that makes my skin crawl.

The hostnqn and hostid are just the tip of the iceberg. We are currently working on security, which will require users to configure unique authentication keys. BTW, the hostid is now mandatory because of explicit registration with Central Discovery Controllers (CDC). The hostid and hostnqn must not only be unique per host, but also remain constant between reboots and software upgrades so that a host can consistently identify itself to a CDC.

Pre-built rootfs images are a pain when it comes to security. For example, we had customers using pre-built rootfs images that ended up with the same SSH keys on all their systems. We ended up giving them a script that they could run on the first boot that would fix all the configuration.

It's a slippery slope when upstream projects have to fix problems artificially created by users. nvme-stas is not the only project that requires post-installation configuration. Are all upstream projects that require post-installation configuration going to be asked to do post-installation configuration at run time (e.g. openssh)? I'm just curious.

I'll have to check with all the interested parties how we want to address this. Like I said, it's not just the hostid and hostnqn. There's a whole bunch of security-related parameters that are coming and everything should be handled the same way. I don't want to have to maintain a multitude of different config scripts.

> Pre-built rootfs images are a pain when it comes to security. For example, we had customers using pre-built rootfs images that ended up with the same SSH keys on all their systems. We ended up giving them a script that they could run on the first boot that would fix all the configuration.

Those people are using broken image build tools. For Fedora CoreOS and derivatives like RHEL CoreOS, our images generate SSH keys on first boot in the same way as traditional dpkg/yum type systems.

> nvme-stas is not the only project that requires post-installation configuration. Are all upstream projects that require post-installation configuration going to be asked to do post-installation configuration at run time (e.g. openssh)? I'm just curious.

Yes exactly: in Fedora derivatives for example, ssh key generation has happened via a systemd unit for many years now. Though it looks like that's not the case on Debian derivatives currently.

Thanks @cgwalters - I was not aware of this special service to generate ssh keys on Fedora. I like this approach and will look at implementing something similar for nvme-stas.

P.S. Yes, I am a Debian developer, but I'm slowly moving towards Fedora. Still have much to learn. 😉

Thanks @martin-belanger, this insight into upcoming requirements and security tightening is essential for integration. We are talking mostly about corner cases here, where the regular generation of the hostnqn and hostid files didn't happen during initial setup. This is not an issue with a traditional OS installation process.

Let me ask you one more curious question: is nvme-stas ever planned to be run within the initramfs phase? (dependencies aside...) This opens a bunch of questions for cases where we can't modify the initramfs image (think of e.g. secure boot signing) and need to find a way to supply the required IDs and keys from the outside. Just thinking ahead.

My original motivation behind this request was to avoid having to pollute the OS with an extra systemd unit. Unlike the ssh key generation and iscsi initiator name setup, which are prerequisites for a specific daemon, there's no such thing for nvme, except for an early boot systemd target. Think for example about nvme-cli calls spawned from udev rules: there we want consistent IDs to be used in the fabrics network.

There were talks about having nvme-stas run during early boot. I know you said "dependencies aside...", but the dependencies are the main reason why nvme-stas cannot run during early boot. The most important ones are 1) the need for an mDNS engine (avahi) to perform automatic discovery of Discovery Controllers, and 2) the need for the dbus-daemon to be up and running so that stafd, stacd, and the avahi-daemon can all talk to each other.

In order to solve these dependency problems, one would have to define and implement a new communication protocol between stafd and stacd other than D-Bus, and implement an mDNS engine that can run at early boot. That's a lot of work.

You are correct about the problem we face during early boot. Where do the unique IDs come from? Do they get read out of the hardware itself by some magic trick?

What I like about using a oneshot service to generate the IDs is that we can use the ConditionFileNotEmpty=|! parameter to verify that all the necessary files needed by stafd and stacd are present before even starting them. For example:

[Unit]
Description=nvme-stas pre-exec configuration verification
ConditionFileNotEmpty=|!/etc/nvme/hostid
ConditionFileNotEmpty=|!/etc/nvme/hostnqn

[Service]
Type=oneshot
ExecStart=/usr/bin/stasadm init

But I guess we could also run a script using ExecStartPre= in both stacd.service and stafd.service to do the same. I'm still looking at the different options.
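
As a rough sketch of what such a first-boot step could do (whether started from a oneshot unit or from ExecStartPre=), the Python below creates the two files only when they are missing or empty and leaves existing ones untouched, so the IDs stay constant across reboots and upgrades. This is an assumption about the approach, not the actual `stasadm` code: the function names are hypothetical, and generating both values from random UUIDs is just one possible choice.

```python
import os
import uuid
from pathlib import Path

# NQN prefix for UUID-based names, as defined by the NVMe specification.
NQN_UUID_PREFIX = 'nqn.2014-08.org.nvmexpress:uuid:'


def missing_or_empty(path):
    '''True when path is not a regular file with a non-zero size, i.e.
    the case that ConditionFileNotEmpty=|!<path> triggers on.'''
    p = Path(path)
    return not (p.is_file() and p.stat().st_size > 0)


def write_once(path, content):
    '''Write content atomically, but only if the file is missing or empty.

    Returns True when the file was created.
    '''
    if not missing_or_empty(path):
        return False
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    tmp = p.with_name(p.name + '.tmp')
    tmp.write_text(content + '\n')
    os.replace(tmp, p)  # rename, so readers never see a partial file
    return True


def init_identity(conf_dir='/etc/nvme'):
    '''Create hostid and hostnqn if needed; return the names created.'''
    created = []
    if write_once(Path(conf_dir) / 'hostid', str(uuid.uuid4())):
        created.append('hostid')
    if write_once(Path(conf_dir) / 'hostnqn',
                  NQN_UUID_PREFIX + str(uuid.uuid4())):
        created.append('hostnqn')
    return created
```

Running it a second time is a no-op, which is what makes it safe to call on every service start.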

Fixed by: #154

Thanks! Tested and confirmed working.