canonical/microcloud

microcloud init error: Failed to add local storage pool: Some peers don't have an available disk

Closed this issue · 6 comments

This is after the disks were discovered by init, then selected for local storage and for wiping.

The disks were previously used but otherwise appear to be in good condition. I wiped them with fdisk -w and wipefs --all prior to attempting the install, and failed several times. Before this last attempt I created GPT partition tables.

It would help to know exactly what ZFS and/or Ceph are looking for to qualify a disk.

Thanks

For ZFS storage, each system needs to have 1 disk selected. If you have not selected a disk for each system, then you get the error you mentioned. So if microcloud init discovers 2 additional systems, then you will need to select 3 disks, one that is local to system1, one for system2 and one for system3.

For Ceph storage, at least 3 systems need to have disks selected, and you can also select more than one disk per system, as long as the constraint of 3 systems having disks is met.

Another detail is that MicroCloud does not support partitions yet and will not display them. So you will have to delete any partitions that exist on those disks before running microcloud init.
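
If it helps, here's a minimal sketch of clearing old partition tables before running microcloud init. The device path /dev/sdX below is just a placeholder for whichever disk you intend to hand to MicroCloud; double-check it first, since both commands destroy everything on that disk:

# Run on each system, once per disk you plan to select during init.
sudo wipefs --all /dev/sdX      # clear filesystem, RAID and partition-table signatures
sudo sgdisk --zap-all /dev/sdX  # remove the GPT (and protective MBR) structures entirely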

Apologies for not being clear.

After microcloud init presented the disk table for local storage, I did select one disk for each of the three systems. It then presented me with another table containing only the selected disks, to pick those to be wiped, at which point I selected all. Then I got that error message.

Isn't it strange that after recognizing the existence of the disks and presumably after having wiped them it goes on to tell me it can't find one or more disks?

Anyway, I got back to this a couple of hours ago to try to create a ZFS vdev on each of the target disks, and I had no problems. So I removed and reinstalled the snaps to restart the process. I declined to install both local and remote storage. Then I went to each member of the cluster and successfully created one OSD on each machine, on the same disks. Now I have a different problem: I can't seem to find the right incantation to get LXD to create remote storage on Ceph.

Thanks for helping.

So to describe how the init process works, MicroCloud just collects information until it begins setting up the daemons and forming the cluster, so you don't have to worry about anything happening to the disks until the cluster starts forming. In fact it's actually LXD that handles wiping the disks in the end for local storage, and MicroCeph that handles it for remote storage.

The reason you see the disk selection and wipe tables back-to-back is because all the validation happens at the end of the segment. So MicroCloud records that you want a certain set of disks in your cluster, then updates that record if any of them should be wiped, and finally checks whether the whole set fits the constraints that MicroCloud requires.

If you encounter that error again, please post the whole output from the microcloud init command so I can check if there's something up with how we're counting disks.


If you want to set disks up manually, here's basically what MicroCloud instructs LXD to do:

# For Ceph:
# Run each "microceph disk add" on the system that owns the disk.
microceph disk add ${disk_on_system1} --wipe
microceph disk add ${disk_on_system2} --wipe
microceph disk add ${disk_on_system3} --wipe

# Create a pending "remote" pool entry on each LXD cluster member...
lxc storage create remote ceph source=lxd_remote --target ${system1}
lxc storage create remote ceph source=lxd_remote --target ${system2}
lxc storage create remote ceph source=lxd_remote --target ${system3}
# ...then run the create once more without --target to finalize the pool cluster-wide.
lxc storage create remote ceph ceph.rbd.du=false ceph.rbd.features=layering,striping,exclusive-lock,object-map,fast-diff,deep-flatten


# For ZFS:
# Create a pending "local" ZFS pool on each member, each backed by a whole disk...
lxc storage create local zfs source=${disk_on_system1} source.wipe=true --target ${system1}
lxc storage create local zfs source=${disk_on_system2} source.wipe=true --target ${system2}
lxc storage create local zfs source=${disk_on_system3} source.wipe=true --target ${system3}
# ...then finalize the pool cluster-wide.
lxc storage create local zfs
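
If it's useful, you can sanity-check the result afterwards. This is just a quick look, assuming the pool names used above; both pools should exist and no longer be pending on any member once the final create has run:

# Confirm the pools exist cluster-wide and inspect their state.
lxc storage list
lxc storage show remote
lxc storage show local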

Thank you very much for that explanation. There may be some sense to the madness.

First a clarification. I'm working with 4 servers, one of them mostly intended to run as a mgr node, with no local storage, but expecting to use remote storage.

So, from your description of how init works, it seems very likely that the peer in question is only that mgr node. So I think we're good on that one.

The way you're creating remote on ceph is not what I expected. I may have misunderstood this note:

This behavior is different for Ceph-based storage pools (ceph, cephfs and cephobject) where each storage pool exists in one central location and therefore, all cluster members access the same storage pool with the same storage volumes.
https://documentation.ubuntu.com/lxd/en/latest/howto/storage_pools/

It is also definitely not presented that way in the online documentation.

But that way does make sense to me for ZFS local storage. Unfortunately, when I gave that last command I got:

Error: Pool not defined on nodes: oscar

oscar being my mgr node. The error was unexpected since I did not choose a disk for the node. I certainly assumed from the context that it was understood that the node did not participate in storage. It seems to be all in or all out.

In the end, a misunderstanding and misreading of the available docs.

That was a very helpful post, sir! Thank you.

I would still like to know what ZFS and Ceph are looking for before using a disk. It would help to explain that instead of, or at least at the same time as, simply prescribing some tool or another.

In a LXD cluster, each and every member of the cluster needs to have a definition for the storage pool. Pool not defined on nodes: oscar is a result of that constraint not being met. So your premise of "all in or all out" is correct.
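
As a quick way to check which members a pool is defined on, something like this should work (using the local pool from your error; the exact output depends on your LXD version):

# Shows the pool's status (pending/created) and the cluster members it is defined on.
lxc storage show local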

First a clarification. I'm working with 4 servers, one of them mostly intended to run as a mgr node, with no local storage, but expecting to use remote storage.

So, from your description of how init works, it seems very likely that the peer in question is only that mgr node. So I think we're good on that one.

Yes, so when running microcloud init with 4 systems, setting up local storage requires 4 disks (1 per system). This is just a constraint of MicroCloud's init process; when creating the storage pools manually in LXD you don't necessarily need a disk per system, but you do still need to define the storage pool on every system.

Looking at the prior block in the docs:

For most storage drivers, the storage pools exist locally on each cluster member. That means that if you create a storage volume in a storage pool on one member, it will not be available on other cluster members.

In the ZFS case, this just means each system will set up its own local zpool. In LXD we allow creating a local zpool on the filesystem without specifying a disk (source=/path/to/disk is not required), but microcloud init does actually expect the zpool to use a whole disk. The commands as I laid them out are equivalent to what microcloud init does after each system in a 3-member MicroCloud has joined the LXD cluster.
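
For what it's worth, a pending definition without any disk could look something like this (the member name and the size value are placeholders; size is optional and only controls the loop file):

# No source= given, so LXD backs this member's zpool with a loop file on its filesystem.
lxc storage create local zfs size=20GiB --target ${system_without_disk}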

This behavior is different for Ceph-based storage pools (ceph, cephfs and cephobject) where each storage pool exists in one central location and therefore, all cluster members access the same storage pool with the same storage volumes.

I suppose "exists" here should be changed to "can exist". With Ceph, the storage pool can be fully remote on some systems, and those systems should still be able to access the pool's storage volumes. So the 1-to-1 constraint of disks-to-systems isn't necessary. That said, microcloud init has a further constraint that at least 3 systems are required to specify a disk for high availability.

The catch is that in both the ZFS and Ceph cases, LXD still needs to have a per-system definition for the storage pool, so that you can launch/move/copy containers/vms or storage pool volumes on any cluster member. That means regardless of whether you are assigning disks per-system or not, you will still need to run lxc storage create --target ${server_name} on each and every cluster member.
This is mentioned at the top of the docs for storage pool configuration in a cluster: https://documentation.ubuntu.com/lxd/en/latest/howto/storage_pools/#create-a-storage-pool-in-a-cluster

If you are running a LXD cluster and want to add a storage pool, you must create the storage pool for each cluster member separately. The reason for this is that the configuration, for example, the storage location or the size of the pool, might be different between cluster members.

Therefore, you must first create a pending storage pool on each member with the --target=<cluster_member> flag and the appropriate configuration for the member.


So in your case if you want to have an mgr system which doesn't have any disks, then you can do something like this:

# For Ceph:
# Skip adding a disk to microceph on "mgr".
microceph disk add ${disk_on_system1} --wipe
microceph disk add ${disk_on_system2} --wipe
microceph disk add ${disk_on_system3} --wipe


lxc storage create remote ceph source=lxd_remote --target "mgr" # Still create the Ceph storage pool on "mgr".
lxc storage create remote ceph source=lxd_remote --target ${system1}
lxc storage create remote ceph source=lxd_remote --target ${system2}
lxc storage create remote ceph source=lxd_remote --target ${system3}
lxc storage create remote ceph ceph.rbd.du=false ceph.rbd.features=layering,striping,exclusive-lock,object-map,fast-diff,deep-flatten


# For ZFS:

lxc storage create local zfs --target "mgr" # Create the ZFS pool directly on the filesystem on "mgr".
lxc storage create local zfs source=${disk_on_system1} source.wipe=true --target ${system1}
lxc storage create local zfs source=${disk_on_system2} source.wipe=true --target ${system2}
lxc storage create local zfs source=${disk_on_system3} source.wipe=true --target ${system3}
lxc storage create local zfs

Fantastic! Pat yourself on the back! Can't do it from where I am. :-)

I think you just saved me a few hours of searching. I was already scratching my head about how to specify the ZFS storage without a "source". I'm sure I would not have thought of it.

On Ceph, I might have eventually tried the first line, and I think I saw something like the last one somewhere.

Very helpful. Even more welcome!

Thanks