ceph/ceph-ansible

monitor keys fail to generate except on first monitor

whinis opened this issue · 4 comments

Bug Report

What happened:
'Error ENOENT: failed to find mgr.localhost in keyring'

What you expected to happen:

How to reproduce it (minimal and precise):
I am setting up a Ceph cluster on Rocky 9. I have tried both stable-7 and stable-8, and each fails with some kind of key error. On stable-7, every monitor except the first one run cannot find mgr.localhost, which I can confirm doesn't exist on any monitor except the first.

Share your group_vars files, inventory and full ceph-ansible log

Environment:

  • OS (e.g. from /etc/os-release): Rocky Linux 9.3
  • Kernel (e.g. uname -a): Linux localhost.localdomain 5.14.0-362.24.1.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Mar 13 17:33:16 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Docker version if applicable (e.g. docker version):
  • Ansible version (e.g. ansible-playbook --version): 2.15.9
  • ceph-ansible version (e.g. git head or tag or stable branch): stable-7.0
  • Ceph version (e.g. ceph -v): ceph version 17.2.7

it looks like ansible_facts['hostname'] for the node 192.168.30.8 returns localhost... how is the hostname configured on that node?

from the log you provided:

2024-03-16 18:25:02,742 p=322679 u=john n=ansible | TASK [ceph-facts : set_fact monitor_name ansible_facts['hostname']] ************************************************************************************************************************************************************************
2024-03-16 18:25:02,785 p=322679 u=john n=ansible | ok: [192.168.30.8] => (item=192.168.30.8) => changed=false 
  ansible_facts:
    monitor_name: localhost
  ansible_loop_var: item
  item: 192.168.30.8
2024-03-16 18:25:02,794 p=322679 u=john n=ansible | ok: [192.168.30.8 -> 192.168.30.6] => (item=192.168.30.6) => changed=false 
  ansible_facts:
    monitor_name: localhost
  ansible_loop_var: item
  item: 192.168.30.6
2024-03-16 18:25:02,804 p=322679 u=john n=ansible | ok: [192.168.30.8 -> 192.168.30.7] => (item=192.168.30.7) => changed=false 
  ansible_facts:
    monitor_name: localhost
  ansible_loop_var: item
  item: 192.168.30.7

all three monitors get monitor_name set to localhost, so this is most likely a hostname misconfiguration in your environment rather than a ceph-ansible bug
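
For anyone hitting the same symptom, here is a minimal sketch of a pre-flight check that flags the situation visible in the log above: several nodes reporting the same hostname, or a hostname of localhost. The host-to-hostname mapping and the function name find_hostname_problems are made up for illustration and are not part of ceph-ansible; in practice the values would come from whatever each node reports as its hostname.

```python
from collections import defaultdict

# Hostname reported for each monitor, copied from the log above.
# The dict shape is an assumption for illustration, not a ceph-ansible structure.
reported = {
    "192.168.30.6": "localhost",
    "192.168.30.7": "localhost",
    "192.168.30.8": "localhost",
}


def find_hostname_problems(hostnames):
    """Return human-readable problems for a {host: hostname} mapping."""
    problems = []
    by_name = defaultdict(list)
    for host, name in sorted(hostnames.items()):
        by_name[name].append(host)
        # An empty or "localhost" hostname cannot identify a node.
        if name in ("", "localhost"):
            problems.append(f"{host}: hostname is {name!r}, expected a unique name")
    # Any hostname shared by more than one node is also a problem.
    for name, hosts in sorted(by_name.items()):
        if len(hosts) > 1:
            problems.append(f"hostname {name!r} is shared by {', '.join(hosts)}")
    return problems


problems = find_hostname_problems(reported)
for problem in problems:
    print(problem)
```

With the values from the log, the check reports each of the three nodes plus the shared name. A mapping with distinct, non-localhost names returns an empty list.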

closing this as per #7518