Cluster of 3 machines, only one can start VM
benoitjpnet opened this issue · 10 comments
I have just created a new cluster of 3 machines.
I created a VM on each LXD server.
But only one server is able to start its VM. On the others, starting the VM fails with:
lxc start u2
Error: Failed setting up disk device "root": Failed to open "/etc/ceph/ceph.client.admin.keyring": open /etc/ceph/ceph.client.admin.keyring: no such file or directory
For some reason containers are fine, only VMs are affected. This is odd since containers also use Ceph.
root@mc10:~# find / -iname ceph.client.admin.keyring
/var/snap/microceph/707/conf/ceph.client.admin.keyring
root@mc10:~#
It is indeed missing on the 2 other servers:
root@mc11:~# find / -iname ceph.client.admin.keyring
root@mc11:~#
Sounds like a MicroCeph issue. @UtkarshBhatthere @sabaini looks like the conf directory is missing on some systems. Got any ideas here?
For starters, the error message says Failed to open /etc/ceph/... while it should be /var/snap/microceph/current/...
This is just a quirk of LXD: inside the LXD snap, /etc/ceph is a symlink to /var/snap/microceph/current/conf, so that LXD can use Ceph from either MicroCeph or a normal host install.
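To illustrate why the error names /etc/ceph even though the file lives under the MicroCeph snap, here is a minimal sketch that rebuilds the same symlink layout in a scratch directory. The `demo_root` directory and the `check_keyring` helper are made up for the illustration and are not part of LXD or MicroCeph.

```shell
#!/bin/sh
# Sketch only: recreate the /etc/ceph -> .../microceph/current/conf layout
# under a temporary directory (demo_root is a hypothetical scratch path).
demo_root=$(mktemp -d)
mkdir -p "$demo_root/var/snap/microceph/current/conf" "$demo_root/etc"
ln -s "$demo_root/var/snap/microceph/current/conf" "$demo_root/etc/ceph"

keyring="$demo_root/etc/ceph/ceph.client.admin.keyring"

# check_keyring (hypothetical helper): report whether the file at the given
# path exists. When it is missing, the path printed is the one that was
# asked for (the symlinked path), which mirrors how the LXD error reads.
check_keyring() {
    if [ -f "$1" ]; then
        echo "present: $(readlink -f "$1")"
    else
        echo "missing: $1"
    fi
}

# No keyring in the conf directory yet, as on mc11 and mc12.
check_keyring "$keyring"

# Create the keyring in the real directory, as on mc10, and check again
# through the symlink.
touch "$demo_root/var/snap/microceph/current/conf/ceph.client.admin.keyring"
check_keyring "$keyring"
```

The first call reports the keyring as missing under the etc/ceph path; the second finds it and prints the resolved location under the MicroCeph conf directory.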
However, it seems that on mc11 there is no keyring at all:
root@mc11:~# find / -iname ceph.client.admin.keyring
root@mc11:~#
@benoitjpnet Could you please post the result of the following 2 commands on mc11:
# Checks to see if microceph and lxd have connected properly.
snap connections lxd
# Checks to see if the symlink has been properly set up inside the snap confinement for LXD.
snap run --shell lxd -c "aa-exec -p unconfined ls -l /etc/ceph"
root@mc11:~# snap connections lxd
Interface           Plug                Slot                 Notes
content[ceph-conf]  lxd:ceph-conf       microceph:ceph-conf  -
lxd                 microcloud:lxd      lxd:lxd              -
lxd-support         lxd:lxd-support     :lxd-support         -
network             lxd:network         :network             -
network-bind        lxd:network-bind    :network-bind        -
system-observe      lxd:system-observe  :system-observe      -
root@mc11:~#
snap run --shell lxd -c "aa-exec -p unconfined ls -l /etc/ceph"
lrwxrwxrwx 1 root root 33 Nov 30 11:02 /etc/ceph -> /var/snap/microceph/current/conf/
Key is missing on mc11 and mc12. The key is present only on the node where I initialized the cluster, mc10. Note that I am able to reproduce the issue with a fresh install.
root@mc10:~# find / -iname ceph.client.admin.keyring
/var/snap/microceph/707/conf/ceph.client.admin.keyring
root@mc11:~# find / -iname ceph.client.admin.keyring
root@mc11:~#
root@mc12:~# find / -iname ceph.client.admin.keyring
root@mc12:~#
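For anyone wanting to run the same check on every cluster member in one go, a rough sketch follows. It assumes root SSH access to the nodes, and the `RUNNER` variable and `report_keyrings` function are names invented here. The keyring path is the one found on mc10, reached through the snap's `current` link.

```shell
#!/bin/sh
# Sketch only: check each node for the MicroCeph admin keyring.
# RUNNER is the command used to execute something on a node; it is
# assumed to be ssh with root access (hypothetical setup).
RUNNER=${RUNNER:-ssh}
KEYRING=/var/snap/microceph/current/conf/ceph.client.admin.keyring

# report_keyrings (hypothetical helper): print one status line per node.
# Any failure of the remote command, including a connection error, is
# reported as MISSING, so check connectivity separately if in doubt.
report_keyrings() {
    for node in "$@"; do
        if $RUNNER "$node" test -f "$KEYRING" </dev/null; then
            echo "$node: keyring present"
        else
            echo "$node: keyring MISSING"
        fi
    done
}

# Example invocation (not run here):
#   report_keyrings mc10 mc11 mc12
```

With the state described in this issue, mc10 would be reported as present and mc11 and mc12 as missing.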
@UtkarshBhatthere happy if I assign this issue to you?
I have reproduced this, will check into it.
I see this issue is not assigned. May I know if it is still being tracked?