canonical/microcloud

microcloud init failed to get system resources of peer

pclerie opened this issue · 1 comment

Good day all,

I am trying to get started with MicroCloud, but microcloud init has failed with the same error on every attempt over the past three days, across numerous reinstalls. Below is a run using latest/edge packages where possible.

Hardware:
RPi 4 / 8GB RAM (6 units)
1TB SSD (intended for Ceph)
64GB Flash (intended for local storage)
eth0 for cluster traffic
Wifi for everything else
Ubuntu 22.04

ladmin@beta:~$ snap list
Name        Version                 Rev    Tracking       Publisher   Notes
core20      20230801                2019   latest/stable  canonical✓  base
core22      20230801                867    latest/stable  canonical✓  base
lxd         git-35860e8             26005  latest/edge    canonical✓  -
microceph   0+git.66190c4           716    latest/edge    canonical✓  -
microcloud  git-813e9d8             662    latest/edge    canonical✓  -
microovn    22.03.3+snapa304000172  278    22.03/stable   canonical✓  -
snapd       2.60.4                  20298  latest/stable  canonical✓  snapd

ladmin@beta:~$ sudo microcloud init
Waiting for LXD to start...
Select an address for MicroCloud's internal traffic:

 Using address "172.28.15.12" for MicroCloud

Limit search for other MicroCloud servers to 172.28.15.12/24? (yes/no) [default=yes]:
Scanning for eligible servers ...

 Selected "delta" at "172.28.15.14"
 Selected "gamma" at "172.28.15.13"
 Selected "kappa" at "172.28.15.15"
 Selected "theta" at "172.28.15.16"
 Selected "beta" at "172.28.15.12"
 Selected "iota" at "172.28.15.17"

Error: Failed to get system resources of peer "iota": Get "https://172.28.15.17:9443/1.0/services/lxd/1.0/resources": Unable to connect to: 172.28.15.17:9443 ([dial tcp 172.28.15.17:9443: i/o timeout])

The relevant MicroCloud logs:

Oct 14 10:40:23 beta systemd[1843]: Started snap.microcloud.microcloud-a7bacd97-8ee8-4655-a95e-ebabcbe8feb4.scope.
Oct 14 10:40:38 beta systemd[1]: Started snap.microcloud.microcloud-ac0b2642-eba3-402f-bf68-196c56f5db30.scope.
Oct 14 10:41:08 beta kernel: [13307.889258] audit: type=1400 audit(1697294468.091:73): apparmor="DENIED" operation="open" profile="snap.microcloud.microcloud" name="/var/lib/snapd/hostfs/etc/ssl/certs/ca-certificates.crt" pid=3221 comm="microcloud" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Oct 14 10:41:28 beta systemd[1]: snap.microcloud.microcloud-ac0b2642-eba3-402f-bf68-196c56f5db30.scope: Deactivated successfully.
Oct 14 10:41:28 beta systemd[1]: snap.microcloud.microcloud-ac0b2642-eba3-402f-bf68-196c56f5db30.scope: Consumed 1.431s CPU time.

And last:

ladmin@iota:~$ ls -la /var/lib/snapd/hostfs
total 8
drwxr-xr-x  2 root root 4096 May 29 08:08 .
drwxr-xr-x 23 root root 4096 Oct 14 07:13 ..
ladmin@beta:~$ ls -la /var/lib/snapd/hostfs
total 8
drwxr-xr-x  2 root root 4096 May 29 08:08 .
drwxr-xr-x 23 root root 4096 Oct 14 07:12 ..

It looks like init cannot start the necessary HTTPS session, apparently because all the nodes are missing files under /var/lib/snapd/hostfs. What is missing, and how do I recover?

Thanks for any clues.

The problem was a network interface configuration error.
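
For anyone hitting the same "i/o timeout" on port 9443, a plain TCP reachability check from the initiating node can help tell a network problem apart from a snap problem before reinstalling anything. The following is a minimal sketch in Python, not part of MicroCloud: the peer names, addresses, and port are copied from the init output above, while the `can_connect` helper and the 3-second timeout are assumptions made for illustration.

```python
import socket

# Peer names and addresses as printed by `microcloud init` in this report.
# "beta" is left out because it is the node running the check.
PEERS = {
    "delta": "172.28.15.14",
    "gamma": "172.28.15.13",
    "kappa": "172.28.15.15",
    "theta": "172.28.15.16",
    "iota": "172.28.15.17",
}

# Port taken from the URL in the error message.
MICROCLOUD_PORT = 9443


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only tests TCP reachability and does not
    speak TLS or call any MicroCloud API.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


if __name__ == "__main__":
    for name, addr in PEERS.items():
        ok = can_connect(addr, MICROCLOUD_PORT)
        status = "reachable" if ok else "UNREACHABLE"
        print(f"{name:6} {addr}:{MICROCLOUD_PORT} {status}")
```

If a peer shows as unreachable here, the interface and routing configuration on that node (for example, which interface actually carries the 172.28.15.0/24 traffic) is worth checking first.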