How to allow the *parent* NetData to collect metrics?
fzyzcjy opened this issue · 8 comments
Thanks for the lib! I want to collect metrics such as Redis and MySQL. We know netdata helmchart cannot automatically discover those redis instances, so I need to write down the location (url) of each redis instance in the configuration.
I do it in the parent netdata, since the parent is a deployment with only 1 replica. If I did it in the child netdata, every child would collect the redis/mysql metrics once, so the collection would be duplicated.
However, I cannot make it work. What I have tried is as follows. (I am trying mysql; redis is not done yet.)
When I go into the parent netdata container and execute the following command, I do see all those metrics happily collected:
su -s /bin/bash netdata && cd /usr/libexec/netdata/plugins.d/ && ./go.d.plugin -d -m mysql
But when looking at my http://url:19999, I see only the following. No MySQL metrics.
Configuration of values.yaml:
parent:
  configs:
    netdata:
      enabled: true
      path: /etc/netdata/netdata.conf
      data: |
        [global]
          memory mode = dbengine
          page cache size = 32
          dbengine multihost disk space = 4000 # unit is MiB
          bind to = 0.0.0.0:19999
        [plugins]
          cgroups = no
          tc = no
          enable running new plugins = yes # NOTE
          check for new plugins every = 72000
          python.d = yes # NOTE
          charts.d = no
          go.d = yes # NOTE
          node.d = no
          apps = no
          proc = no
          idlejitter = no
          diskspace = no
    godconf:
      enabled: true
      path: /etc/netdata/go.d.conf
      data: |
        # netdata go.d.plugin configuration
        #
        # This file is in YAML format.
        # Enable/disable the whole go.d.plugin.
        enabled: yes
        # Enable/disable default value for all modules.
        default_run: yes
        # Maximum number of used CPUs. Zero means no limit.
        max_procs: 0
        # Enable/disable specific g.d.plugin module
        # If you want to change any value, you need to uncomment out it first.
        # IMPORTANT: Do not remove all spaces, just remove # symbol. There should be a space before module name.
        modules:
        #  activemq: yes
        #  apache: yes
        #  bind: yes
        #  cockroachdb: yes
        #  consul: yes
        #  coredns: yes
        #  couchbase: yes
        #  couchdb: yes
        #  dnsdist: yes
        #  dnsmasq: yes
        #  dnsmasq_dhcp: yes
        #  dns_query: yes
        #  docker_engine: yes
        #  dockerhub: yes
        #  elasticsearch: yes
        #  example: no
        #  filecheck: yes
        #  fluentd: yes
        #  freeradius: yes
        #  hdfs: yes
        #  httpcheck: yes
        #  isc_dhcpd: yes
        #  k8s_kubelet: yes
        #  k8s_kubeproxy: yes
        #  lighttpd: yes
        #  lighttpd2: yes
        #  logstash: yes
          mysql: yes # NOTE XXX
        #  nginx: yes
        #  nginxvts: yes
        #  openvpn: yes
        #  phpdaemon: yes
        #  phpfpm: yes
        #  pihole: yes
        #  pika: yes
        #  portcheck: yes
        #  powerdns: yes
        #  powerdns_recursor: yes
        #  prometheus: yes
        #  pulsar: yes
        #  rabbitmq: yes
        #  redis: yes
        #  scaleio: yes
        #  solr: yes
        #  springboot2: yes
        #  squidlog: yes
        #  systemdunits: yes
        #  tengine: yes
        #  unbound: yes
        #  vernemq: yes
        #  vcsa: yes
        #  vsphere: yes
        #  web_log: yes
        #  whoisquery: yes
        #  wmi: yes
        #  x509check: yes
        #  zookeeper: yes
    mysql:
      enabled: true
      path: /etc/netdata/go.d/mysql.conf
      data: |
        jobs:
          - name: local
            dsn: '{{ .Values.mysql.username }}:{{ .Values.mysql.password }}@tcp({{ .Values.mysql.host }}:{{ .Values.mysql.port }})/'
            # dsn: '{{ .Values.mysql.username }}:{{ .Values.mysql.password }}@tcp({{ .Values.mysql.host }}:{{ .Values.mysql.port }})/{{ .Values.mysql.database }}'
Ah I see the problem... My original configuration is like:
data: |
  [global]
    # https://learn.netdata.cloud/docs/agent/database/engine/
    memory mode = dbengine
    # NOTE use the calculator to re-calc how much is needed after having more machines https://learn.netdata.cloud/docs/agent/database/calculator
    page cache size = 32
    dbengine multihost disk space = 4000 # unit is MiB
    bind to = 0.0.0.0:19999
  [plugins]
    cgroups = no
    tc = no
    enable running new plugins = yes # NOTE with some comments here
    check for new plugins every = 72000
    python.d = yes # NOTE with some comments here
    charts.d = no
    go.d = yes # NOTE with some comments here
    node.d = no
    apps = no
    proc = no
    idlejitter = no
    diskspace = no
AND THE PROBLEM IS THE COMMENTS! Strangely, a line like
go.d = yes # NOTE with some comments here
is NOT allowed and will be regarded as no. Change it to
go.d = yes
and then it is ok.
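For reference, a minimal sketch of the [plugins] part of the values.yaml with the notes moved onto their own lines, which is the form that avoids the problem described above. The nesting under parent.configs.netdata follows the configuration posted earlier in this issue; the wording of the comments is only illustrative, and the explanation that the trailing text ends up as part of the value is an assumption based on the observed behaviour.

```yaml
parent:
  configs:
    netdata:
      enabled: true
      path: /etc/netdata/netdata.conf
      data: |
        [plugins]
          # NOTE: keep comments on their own line. A trailing comment after
          # the value seems to be read as part of the value (assumption),
          # so "yes # NOTE ..." is not treated as "yes".
          enable running new plugins = yes
          python.d = yes
          go.d = yes
```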
@fzyzcjy hi
The Netdata k8s setup is parent/deployment and children/daemonset. The parent netdata doesn't collect any metrics for a reason: there is a netdata child instance on the same host, and it is responsible for metric collection.
go.d.plugin static configurations don't work in k8s; we rely on service discovery. The service discovery job is to create/update a configuration file; go.d.plugin watches the file and starts/stops jobs according to it.
dsn: '{{ .Values.mysql.username }}:{{ .Values.mysql.password }}@tcp({{ .Values.mysql.host }}:{{ .Values.mysql.port }})/'
I see you want to get mysql instances discovered?
We have this
helmchart/charts/netdata/sdconfig/child.yml
Lines 46 to 47 in 33fc823
It doesn't work, I assume, because there is no user/password. Let's think about how we can get it.
How do you set user/pass for mysql? Adding some env variables to your k8s manifests?
How to check the go.d configuration file created by service discovery:
0 ~ $ kubectl -n mon exec netdata-child-7xbcv -c netdata -- cat /etc/netdata/go.d/sd/go.d.yml
- module: cockroachdb
  name: cockroachdb-infra_crdb-cockroachdb-0_db_tcp_8080
  url: http://10.1.4.9:8080/_status/vars
- module: prometheus
  name: prometheus-infra_redis-cdc-streamer-master-0_metrics_tcp_9121
  url: http://10.1.4.7:9121/metrics
  update_every: 10
  max_time_series: 1000
@ilyam8 Thanks for the reply!
I have successfully managed to do the MySQL part using the NetData parent. I have to use the parent, because a netdata child runs on each host. And my MySQL is an (external) service, not a pod. So if each netdata child scrapes that MySQL service, MySQL will be scraped N times instead of once.
Now, as you have mentioned, the remaining problem is the password. I am configuring Redis with auto-discovery (see this post: #188 (comment)). However, my redis has a password, so that is the problem.
How do you set user/pass for mysql? Adding some env variables to your k8s manifests?
I create a k8s Secret containing the password. Then I pass the secret as an env variable to whatever pod wants to consume the password. For a concrete example, see this simple tutorial: https://medium.com/faun/using-kubernetes-secrets-as-environment-variables-5ea3ef7581ef
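A minimal sketch of that Secret-to-env pattern in plain Kubernetes manifests. All names here (mysql-credentials, MYSQL_PASSWORD, the consumer pod and its busybox image) are hypothetical placeholders, not something taken from this setup or from the netdata helmchart.

```yaml
# Secret holding the password (hypothetical name and placeholder value)
apiVersion: v1
kind: Secret
metadata:
  name: mysql-credentials
type: Opaque
stringData:
  password: change-me
---
# Any pod that needs the password reads it through an env variable
apiVersion: v1
kind: Pod
metadata:
  name: consumer
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      env:
        - name: MYSQL_PASSWORD        # hypothetical variable name
          valueFrom:
            secretKeyRef:
              name: mysql-credentials # must match the Secret name above
              key: password           # key inside the Secret
```

Whether the netdata helmchart can inject such a variable into the parent or child pods is not something this sketch assumes.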
I have to use parent, because netdata child run on each host.
Yes, it is the plan.
So if each netdata child scrapes that MySQL service, the mysql will be scraped N times instead of 1 time
No, no, a child instance collects data only from the pods running on the same host.
See
helmchart/charts/netdata/sdconfig/child.yml
Lines 1 to 6 in 33fc823
local_mode: true means watch only for local node pod events.
So if each netdata child scrapes that MySQL service
Wrong metrics then:
- scrape1: => mysql instance0
- scrape2: => mysql instance1
- scrape3: => mysql instance1
- scrape4: => mysql instance0
- ...
@ilyam8 Thanks for the reply. However, my MySQL is an external service, not a pod. Thus it is not specific to a single node. e.g.
apiVersion: v1
kind: Service
metadata:
  name: {{ include "external-mysql.fullname" . }}
  labels:
    {{- include "external-mysql.labels" . | nindent 4 }}
spec:
  type: ExternalName
  externalName: {{ .Values.service.externalName }}
  ports:
    - port: {{ .Values.service.port }}
Anyway, I have managed to do it in parent netdata :)
I see. For that case I added role: service to sd. I was thinking the setup would be: child netdata with sd role: pod and parent netdata with sd role: service. But we never really tested service discovery for a parent, and our helmchart doesn't even have such an option.