Add support or example for `processors` config section
a03nikki opened this issue · 7 comments
Describe the feature:
Add support or an example for the processors configuration section within this role.
There is a closed discuss question about this same problem, but it did not reach a resolution.
Beats product: Metricbeat
Beats version: 7.6.0
Role version: 7.6.0
OS version (uname -a if on a Unix-like system):
Beat host: Ubuntu container
root@redacted:/# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.4 LTS"
root@redacted:/# uname -a
Linux 4768ab1b9aa9 4.19.76-linuxkit #1 SMP Thu Oct 17 19:31:58 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Controlling host: MacOS Catalina - 10.15.3 (19D76)
~ % uname -a
Darwin redacted.local 19.3.0 Darwin Kernel Version 19.3.0: Thu Jan 9 20:58:23 PST 2020; root:xnu-6153.81.5~1/RELEASE_X86_64 x86_64
Description of the problem including expected versus actual behavior:
The goal is for the playbook to produce the default Metricbeat v7 configuration using ansible-playbook. On Ubuntu, the default, with comments and blank lines removed, is:
root@redacted:/# grep -v "^\s*#" /etc/metricbeat/metricbeat.yml | grep -v "^\s*$"
metricbeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 1
  index.codec: best_compression
setup.kibana:
output.elasticsearch:
  hosts: ["localhost:9200"]
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
With a playbook such as the one listed below, one can get close, but the processors are incorrect:
root@redacted:/# cat /etc/metricbeat/metricbeat.yml
# Ansible managed
################### metricbeat Configuration #########################
############################# metricbeat ######################################
metricbeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
processors:
- add_host_metadata: null
- add_cloud_metadata: null
- add_docker_metadata: null
- add_kubernetes_metadata: null
setup.template.settings:
  index.codec: best_compression
  index.number_of_shards: 1
###############################################################################
############################# Libbeat Config ##################################
# Base config file used by all other beats for using libbeat features
############################# Output ##########################################
output:
  elasticsearch:
    hosts:
    - localhost:8200
############################# Logging #########################################
logging:
  files:
    rotateeverybytes: 10485760
This configuration will run Metricbeat, but the metadata about the servers will be missing from the records, so the Metrics UI in Kibana does not work properly.
Single quoting (') the ~ produces a config file that includes
processors:
- add_host_metadata: '~'
- add_cloud_metadata: '~'
- add_docker_metadata: '~'
- add_kubernetes_metadata: '~'
which prevents Metricbeat from starting, because it exits with an error saying that a string cannot be converted into an object.
RUNNING HANDLER [elastic.beats : restart the service] **************************
fatal: [7a2b85391a72]: FAILED! => {"changed": false, "msg": "Failed to restart service: metricbeat", "rc": 1, "stderr": "2020-03-02T23:05:43.532Z\tINFO\tinstance/beat.go:622\tHome path: [/usr/share/metricbeat/bin] Config path: [/usr/share/metricbeat/bin] Data path: [/usr/share/metricbeat/bin/data] Logs path: [/usr/share/metricbeat/bin/logs]\n2020-03-02T23:05:43.532Z\tINFO\tinstance/beat.go:630\tBeat ID: 05094e0e-7db0-4357-a1d0-f6a40f08eda8\n2020-03-02T23:05:43.532Z\tERROR\tinstance/beat.go:933\tExiting: error initializing processors: can not convert 'string' into 'object' accessing 'processors.0.add_host_metadata' (source:'/etc/metricbeat/metricbeat.yml')\nExiting: error initializing processors: can not convert 'string' into 'object' accessing 'processors.0.add_host_metadata' (source:'/etc/metricbeat/metricbeat.yml')\n",
"stderr_lines": ["2020-03-02T23:05:43.532Z\tINFO\tinstance/beat.go:622\tHome path: [/usr/share/metricbeat/bin] Config path: [/usr/share/metricbeat/bin] Data path: [/usr/share/metricbeat/bin/data] Logs path: [/usr/share/metricbeat/bin/logs]",
"2020-03-02T23:05:43.532Z\tINFO\tinstance/beat.go:630\tBeat ID: 05094e0e-7db0-4357-a1d0-f6a40f08eda8",
"2020-03-02T23:05:43.532Z\tERROR\tinstance/beat.go:933\tExiting: error initializing processors: can not convert 'string' into 'object' accessing 'processors.0.add_host_metadata' (source:'/etc/metricbeat/metricbeat.yml')",
"Exiting: error initializing processors: can not convert 'string' into 'object' accessing 'processors.0.add_host_metadata' (source:'/etc/metricbeat/metricbeat.yml')"],
"stdout": " ...fail!\n", "stdout_lines": [" ...fail!"]}
Double quoting (") the ~ produces a config that includes
processors:
- add_host_metadata: '~'
- add_cloud_metadata: '~'
- add_docker_metadata: '~'
- add_kubernetes_metadata: '~'
TASK [elastic.beats : Start metricbeat service] ********************************
fatal: [7a2b85391a72]: FAILED! => {"changed": false, "msg": "Failed to start service: metricbeat", "rc": 1, "stderr": "2020-03-02T23:09:51.996Z\tINFO\tinstance/beat.go:622\tHome path: [/usr/share/metricbeat/bin] Config path: [/usr/share/metricbeat/bin] Data path: [/usr/share/metricbeat/bin/data] Logs path: [/usr/share/metricbeat/bin/logs]\n2020-03-02T23:09:51.996Z\tINFO\tinstance/beat.go:630\tBeat ID: 05094e0e-7db0-4357-a1d0-f6a40f08eda8\n2020-03-02T23:09:51.996Z\tERROR\tinstance/beat.go:933\tExiting: error initializing processors: can not convert 'string' into 'object' accessing 'processors.0.add_host_metadata' (source:'/etc/metricbeat/metricbeat.yml')\nExiting: error initializing processors: can not convert 'string' into 'object' accessing 'processors.0.add_host_metadata' (source:'/etc/metricbeat/metricbeat.yml')\n",
"stderr_lines": ["2020-03-02T23:09:51.996Z\tINFO\tinstance/beat.go:622\tHome path: [/usr/share/metricbeat/bin] Config path: [/usr/share/metricbeat/bin] Data path: [/usr/share/metricbeat/bin/data] Logs path: [/usr/share/metricbeat/bin/logs]",
"2020-03-02T23:09:51.996Z\tINFO\tinstance/beat.go:630\tBeat ID: 05094e0e-7db0-4357-a1d0-f6a40f08eda8",
"2020-03-02T23:09:51.996Z\tERROR\tinstance/beat.go:933\tExiting: error initializing processors: can not convert 'string' into 'object' accessing 'processors.0.add_host_metadata' (source:'/etc/metricbeat/metricbeat.yml')",
"Exiting: error initializing processors: can not convert 'string' into 'object' accessing 'processors.0.add_host_metadata' (source:'/etc/metricbeat/metricbeat.yml')"],
"stdout": " ...fail!\n", "stdout_lines": [" ...fail!"]}
Playbook:
- name: Install and configure Beats
  hosts: all
  tasks:
    - name: 'Install Metricbeat'
      include_role:
        name: elastic.beats
      vars:
        beat: metricbeat
        beat_conf:
          metricbeat.config.modules:
            path: '${path.config}/modules.d/*.yml'
            reload.enabled: false
          setup.template.settings:
            index.number_of_shards: 1
            index.codec: best_compression
          processors:
            - add_host_metadata: ~
            - add_cloud_metadata: ~
            - add_docker_metadata: ~
            - add_kubernetes_metadata: ~
        output_conf:
          elasticsearch:
            hosts: ['localhost:8200']
Provide logs from Ansible:
Beats logs if relevant:
Here is what I've established so far.
First, I looked up in the YAML v1.2 spec that the ~ character indicates null.
From 10.3.2. Tag Resolution:
Regular expression: null | Null | NULL | ~
Resolved to tag: tag:yaml.org,2002:null
Next, I reviewed the documentation for the add_host_metadata processor. I observed that all of the parameters on the processor are optional.
Therefore, when ~ is used in the configuration file, the processor is enabled using the default settings.
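The quoting failures reported earlier are consistent with this reading: only a plain, unquoted ~ resolves to null, while a quoted ~ is an ordinary string. A minimal sketch of the difference follows; the key names are hypothetical and exist only for illustration.

```yaml
# Hypothetical keys, for illustration only.
plain_tilde: ~        # plain scalar: resolves to null
plain_null: null      # plain scalar: resolves to null
single_quoted: '~'    # quoted scalar: the one-character string "~"
double_quoted: "~"    # quoted scalar: the one-character string "~"
empty_map: {}         # flow mapping with no entries
```

Assuming Ansible re-serializes the parsed value when it writes the file, this would also explain why both quoting styles end up as '~' in the generated config, and why Metricbeat then reports a string where it expects an object.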
So I retried the ~ in the Ansible file, and it does appear that the metadata is populated on the records:
processors:
- add_host_metadata: null
- add_cloud_metadata: null
- add_docker_metadata: null
- add_kubernetes_metadata: null
So I'm confused. 😕
This does appear to work as well (sending an empty map with {}):
processors:
- add_host_metadata: {}
- add_cloud_metadata: {}
- add_docker_metadata: {}
- add_kubernetes_metadata: {}
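Since the empty map is accepted, the same shape should extend to a processor with explicit options. A minimal sketch, assuming the netinfo.enabled option described in the add_host_metadata documentation; the value shown is illustrative and has not been tested with this role.

```yaml
beat_conf:
  processors:
    # Options go in a map under the processor name.
    - add_host_metadata:
        netinfo.enabled: true   # assumed option; see the processor documentation
    # No options needed: an empty map keeps the defaults.
    - add_cloud_metadata: {}
```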
😕 I am also not certain where the undesirable behavior is coming from any more (Ansible vs. Metricbeat).
Metricbeat uses gopkg.in/yaml.v2 to parse the metricbeat.yml file, if I am reading the code correctly.
There are a number of issues (both open and closed) related to the underlying Go library used by Metricbeat to handle parsing the YAML configuration. So that could be an avenue worth investigating.
Hi @a03nikki,
Indeed, ~ is translated to a null value by YAML, which both Ansible and Beats use, so /etc/metricbeat/metricbeat.yml will be written with add_XXX_metadata: null.
However, this configuration seems to work, as I am able to retrieve cloud metadata on a GCP instance:
- playbook:
- hosts: localhost
  roles:
    - role: elastic.elasticsearch
    - role: elastic.beats
      beat: metricbeat
      beat_conf:
        metricbeat.config.modules:
          path: '${path.config}/modules.d/*.yml'
          reload.enabled: false
        setup.template.settings:
          index.number_of_shards: 1
          index.codec: best_compression
        processors:
          - add_host_metadata: ~
          - add_cloud_metadata: ~
          - add_docker_metadata: ~
          - add_kubernetes_metadata: ~
- generated /etc/metricbeat/metricbeat.yml:
# Ansible managed
################### metricbeat Configuration #########################
############################# metricbeat ######################################
metricbeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
processors:
- add_host_metadata: null
- add_cloud_metadata: null
- add_docker_metadata: null
- add_kubernetes_metadata: null
setup.template.settings:
  index.codec: best_compression
  index.number_of_shards: 1
###############################################################################
############################# Libbeat Config ##################################
# Base config file used by all other beats for using libbeat features
############################# Output ##########################################
output:
  elasticsearch:
    hosts:
    - localhost:9200
############################# Logging #########################################
logging:
  files:
    rotateeverybytes: 10485760
- Metricbeat logs:
Mar 04 14:53:20 jmlrt-test systemd[1]: Started Metricbeat is a lightweight shipper for metrics..
...
Mar 04 14:53:21 jmlrt-test metricbeat[29405]: 2020-03-04T14:53:21.034Z INFO add_cloud_metadata/add_cloud_metadata.go:93 add_cloud_metadata: hosting provider type detected as gcp, metadata={"availability_zone":"europe-west1-b","instance":{"id":"1000121967836269311","name":"jmlrt-test"},"
...
Mar 04 14:53:21 jmlrt-test metricbeat[29405]: 2020-03-04T14:53:21.050Z INFO instance/beat.go:439 metricbeat start running.
- Example of record in Elasticsearch:
{
"_index" : "metricbeat-7.6.0-2020.03.04-000001",
"_type" : "_doc",
"_id" : "39kVpnAB1zu53QkUCx9J",
"_version" : 1,
"_seq_no" : 9,
"_primary_term" : 1,
"found" : true,
"_source" : {
"@timestamp" : "2020-03-04T15:05:58.970Z",
"system" : {
"filesystem" : {
"used" : {
"pct" : 0.0343,
"bytes" : 3756032
},
"device_name" : "/dev/sda15",
"mount_point" : "/boot/efi",
"type" : "vfat",
"total" : 109422592,
"free" : 105666560,
"available" : 105666560,
"files" : 0,
"free_files" : 0
}
},
"ecs" : {
"version" : "1.4.0"
},
"host" : {
"hostname" : "jmlrt-test",
"architecture" : "x86_64",
"os" : {
"name" : "Ubuntu",
"kernel" : "5.0.0-1031-gcp",
"codename" : "bionic",
"platform" : "ubuntu",
"version" : "18.04.4 LTS (Bionic Beaver)",
"family" : "debian"
},
"name" : "jmlrt-test",
"id" : "6c148824a8b9c66a44c3da2ce1402ec1",
"containerized" : false
},
"agent" : {
"version" : "7.6.0",
"type" : "metricbeat",
"ephemeral_id" : "d96c180a-c174-47de-bb3e-336ff82e6dad",
"hostname" : "jmlrt-test",
"id" : "07b0324a-f0fc-43a6-8289-6e5f5c27dd45"
},
"cloud" : {
"project" : {
"id" : "elastic-infra"
},
"provider" : "gcp",
"instance" : {
"name" : "jmlrt-test",
"id" : "1000121967836269311"
},
"machine" : {
"type" : "n2-standard-2"
},
"availability_zone" : "europe-west1-b"
},
"event" : {
"dataset" : "system.filesystem",
"module" : "system",
"duration" : 1589087
},
"metricset" : {
"name" : "filesystem",
"period" : 60000
},
"service" : {
"type" : "system"
}
}
}
@jmlrt : I am glad it is working for you too. I think it is sufficient to say that there is not a bug in Ansible or Metricbeat.
So maybe the best option is to document a solution in the README.md so other people don't also have to struggle with this?
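One possible shape for such a README example, adapted from the playbooks earlier in this thread. This is a sketch rather than the role's official documentation: it uses the empty-map form ({}) reported to work above, and the task names and output host are placeholders.

```yaml
- name: Install and configure Metricbeat
  hosts: all
  tasks:
    - name: Install Metricbeat with metadata processors
      include_role:
        name: elastic.beats
      vars:
        beat: metricbeat
        beat_conf:
          metricbeat.config.modules:
            path: '${path.config}/modules.d/*.yml'
            reload.enabled: false
          # Each processor takes a map of options; {} means defaults only.
          # Do not quote the value: a quoted '~' is a string and fails to load.
          processors:
            - add_host_metadata: {}
            - add_cloud_metadata: {}
            - add_docker_metadata: {}
            - add_kubernetes_metadata: {}
        output_conf:
          elasticsearch:
            hosts: ['localhost:9200']   # placeholder host
```

An unquoted ~ in place of {} was also reported to work in this thread; the empty map is used here only because it reads less ambiguously in the generated file.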
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity since being marked as stale.