infrawatch/collectd-config-ansible-role

ceph daemons should be self-configuring

Opened this issue · 3 comments

Ceph daemons are deployed on every ceph node, but their names include cluster-wide numbers (i.e. each number is unique across the cluster, not just on one node).

E.g.:

[root@ceph-0 run]# ls -l /var/run/ceph
total 0
srwxr-xr-x. 1 167 167 0 Oct 19 09:05 ceph-osd.0.asok
srwxr-xr-x. 1 167 167 0 Oct 19 09:05 ceph-osd.10.asok
srwxr-xr-x. 1 167 167 0 Oct 19 09:05 ceph-osd.13.asok
srwxr-xr-x. 1 167 167 0 Oct 19 09:05 ceph-osd.4.asok
srwxr-xr-x. 1 167 167 0 Oct 19 09:05 ceph-osd.7.asok

These sockets have to be provided to the collectd ceph plugin: https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod#plugin-ceph

Would these not be passed in as config values? They are parsed in:

{% for daemon in collectd_plugin_ceph_daemon %}
<Daemon "{{ daemon.name|e }}">
SocketPath "{{ daemon.socketpath|e }}"
</Daemon>
{% endfor %}
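To make the expected input concrete, here is a minimal Python sketch of what that loop produces. The keys `name` and `socketpath` come from the template; the function name `render_ceph_daemons` is hypothetical, the example entries are taken from the socket listing above, and `html.escape` only approximates Jinja's `e` filter.

```python
from html import escape


def render_ceph_daemons(daemons):
    """Build one <Daemon> block per entry, mirroring the template loop."""
    blocks = []
    for daemon in daemons:
        blocks.append(
            '<Daemon "{}">\n'
            '  SocketPath "{}"\n'
            "</Daemon>".format(
                escape(daemon["name"]), escape(daemon["socketpath"])
            )
        )
    return "\n".join(blocks)


# Example value for collectd_plugin_ceph_daemon (entries are illustrative).
collectd_plugin_ceph_daemon = [
    {"name": "osd.0", "socketpath": "/var/run/ceph/ceph-osd.0.asok"},
    {"name": "osd.10", "socketpath": "/var/run/ceph/ceph-osd.10.asok"},
]

print(render_ceph_daemons(collectd_plugin_ceph_daemon))
```

Every node that receives this list renders the same blocks, whether or not the sockets exist locally.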

For cluster configuration, this is something that would be done by whoever/whatever consumes this collectd_config role (e.g. tripleo-collectd-ansible-role).

These config values are currently fed by the tripleo config. However, that writes the same config to all nodes, whereas the osds (numbered 1-15, for example) are spread across 3 nodes; node 1 might get osds 1, 3, 6, 7, and 13.
Collectd doesn't handle non-existent osds very well, so it would be great to just look into /var/lib/ceph/ on each node and add the mgrs and osds found there to the collectd config.
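A rough sketch of what such per-node discovery could look like, in Python. It is not part of the role. It assumes the admin sockets follow the `ceph-<type>.<id>.asok` naming seen in the listing above and scans the socket directory from that listing (`/var/run/ceph`) rather than `/var/lib/ceph`; the function name `discover_ceph_daemons` is hypothetical. The result has the same shape as `collectd_plugin_ceph_daemon`.

```python
import glob
import os
import re

# Assumed naming scheme, based on the listing above: ceph-<type>.<id>.asok
SOCKET_RE = re.compile(r"^ceph-(?P<name>[a-z]+\..+)\.asok$")


def discover_ceph_daemons(run_dir="/var/run/ceph"):
    """Return one {"name", "socketpath"} dict per admin socket in run_dir."""
    daemons = []
    for path in sorted(glob.glob(os.path.join(run_dir, "*.asok"))):
        match = SOCKET_RE.match(os.path.basename(path))
        if match:  # skip files that do not follow the assumed scheme
            daemons.append({"name": match.group("name"), "socketpath": path})
    return daemons
```

Run on each node (for instance from a custom fact or a small module), this would yield only the daemons that actually exist there, so the rendered config never references an osd living on another node.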