[Bug]: Chart filtering to prometheus appears to be broken
johnalotoski opened this issue · 3 comments
Bug description
Netdata should be able to filter charts from the prometheus exporter per the docs.
However, adding a prometheus exporter filter doesn't seem to do any filtering anymore.
Expected behavior
The configured filter should restrict which charts appear in the prometheus metrics output.
Steps to reproduce
Create a netdata.conf that does some filtering; in this example, try to filter out everything:
[prometheus:exporter]
send charts matching = !*
Check whether any filtering occurred, and observe that nothing has been filtered:
# curl -s 'localhost:19999/api/v1/allmetrics?format=prometheus' | wc -l
2026
Finer-grained filtering also seems to make no difference; for example, with a pattern of `statsd_*`:
[prometheus:exporter]
send charts matching = statsd_*
I would expect to see only charts starting with `statsd_`, but as above, we see all charts:
# curl -s 'localhost:19999/api/v1/allmetrics?format=prometheus' | wc -l
2026
Installation method
other
System info
Linux sanchonet1-test-a-1 6.1.84 #1-NixOS SMP PREEMPT_DYNAMIC Wed Apr 3 13:19:55 UTC 2024 x86_64 GNU/Linux
/etc/lsb-release:DISTRIB_CODENAME=tapir
/etc/lsb-release:DISTRIB_DESCRIPTION="NixOS 23.11 (Tapir)"
/etc/lsb-release:DISTRIB_ID=nixos
/etc/lsb-release:DISTRIB_RELEASE="23.11"
/etc/lsb-release:LSB_VERSION="23.11 (Tapir)"
/etc/os-release:BUILD_ID="23.11pre-git"
/etc/os-release:ID=nixos
/etc/os-release:LOGO="nix-snowflake"
/etc/os-release:NAME=NixOS
/etc/os-release:PRETTY_NAME="NixOS 23.11 (Tapir)"
/etc/os-release:SUPPORT_END="2024-06-30"
/etc/os-release:VERSION="23.11 (Tapir)"
/etc/os-release:VERSION_CODENAME=tapir
/etc/os-release:VERSION_ID="23.11"
Netdata build info
Packaging:
Netdata Version ____________________________________________ : v1.43.2
Installation Type __________________________________________ : unknown
Package Architecture _______________________________________ : unknown
Package Distro _____________________________________________ : unknown
Configure Options __________________________________________ : REMOVED FOR CLOSURE SIZE REASONS
Default Directories:
User Configurations ________________________________________ : /etc/netdata
Stock Configurations _______________________________________ : /nix/store/r1vk56fkg3dpdbly303ir2hk909naw2h-netdata-1.43.2/lib/netdata/conf.d
Ephemeral Databases (metrics data, metadata) _______________ : /var/cache/netdata
Permanent Databases ________________________________________ : /var/lib/netdata
Plugins ____________________________________________________ : /nix/store/r1vk56fkg3dpdbly303ir2hk909naw2h-netdata-1.43.2/libexec/netdata/plugins.d
Static Web Files ___________________________________________ : /nix/store/r1vk56fkg3dpdbly303ir2hk909naw2h-netdata-1.43.2/share/netdata/web
Log Files __________________________________________________ : /var/log/netdata
Lock Files _________________________________________________ : /var/lib/netdata/lock
Home _______________________________________________________ : /var/lib/netdata
Operating System:
Kernel _____________________________________________________ : Linux
Kernel Version _____________________________________________ : 6.1.84
Operating System ___________________________________________ : NixOS
Operating System ID ________________________________________ : nixos
Operating System ID Like ___________________________________ : unknown
Operating System Version ___________________________________ : 23.11 (Tapir)
Operating System Version ID ________________________________ : none
Detection __________________________________________________ : /etc/os-release
Hardware:
CPU Cores __________________________________________________ : 2
CPU Frequency ______________________________________________ : 2199000000
CPU Architecture ___________________________________________ : 4072284160
RAM Bytes __________________________________________________ : 85899345920
Disk Capacity ______________________________________________ : x86_64
Virtualization Technology __________________________________ : amazon
Virtualization Detection ___________________________________ : systemd-detect-virt
Container:
Container __________________________________________________ : none
Container Detection ________________________________________ : systemd-detect-virt
Container Orchestrator _____________________________________ : none
Container Operating System _________________________________ : none
Container Operating System ID ______________________________ : none
Container Operating System ID Like _________________________ : none
Container Operating System Version _________________________ : none
Container Operating System Version ID ______________________ : none
Container Operating System Detection _______________________ : none
Features:
Built For __________________________________________________ : Linux
Netdata Cloud ______________________________________________ : NO (disabled)
Health (trigger alerts and send notifications) _____________ : YES
Streaming (stream metrics to parent Netdata servers) _______ : YES
Replication (fill the gaps of parent Netdata servers) ______ : YES
Streaming and Replication Compression ______________________ : YES (lz4)
Contexts (index all active and archived metrics) ___________ : YES
Tiering (multiple dbs with different metrics resolution) ___ : YES (5)
Machine Learning ___________________________________________ : YES
Database Engines:
dbengine ___________________________________________________ : YES
alloc ______________________________________________________ : YES
ram ________________________________________________________ : YES
map ________________________________________________________ : YES
save _______________________________________________________ : YES
none _______________________________________________________ : YES
Connectivity Capabilities:
ACLK (Agent-Cloud Link: MQTT over WebSockets over TLS) _____ : NO
static (Netdata internal web server) _______________________ : YES
h2o (web server) ___________________________________________ : NO
WebRTC (experimental) ______________________________________ : NO
Native HTTPS (TLS Support) _________________________________ : YES
TLS Host Verification ______________________________________ : YES
Libraries:
LZ4 (extremely fast lossless compression algorithm) ________ : YES
zlib (lossless data-compression library) ___________________ : YES
Judy (high-performance dynamic arrays and hashtables) ______ : YES (bundled)
dlib (robust machine learning toolkit) _____________________ : YES (bundled)
protobuf (platform-neutral data serialization protocol) ____ : NO
OpenSSL (cryptography) _____________________________________ : YES
libdatachannel (stand-alone WebRTC data channels) __________ : NO
JSON-C (lightweight JSON manipulation) _____________________ : YES
libcap (Linux capabilities system operations) ______________ : YES
libcrypto (cryptographic functions) ________________________ : YES
libm (mathematical functions) ______________________________ : YES
jemalloc ___________________________________________________ : YES
TCMalloc ___________________________________________________ : NO
Plugins:
apps (monitor processes) ___________________________________ : YES
cgroups (monitor containers and VMs) _______________________ : YES
cgroup-network (associate interfaces to CGROUPS) ___________ : YES
proc (monitor Linux systems) _______________________________ : YES
tc (monitor Linux network QoS) _____________________________ : YES
diskspace (monitor Linux mount points) _____________________ : YES
freebsd (monitor FreeBSD systems) __________________________ : NO
macos (monitor MacOS systems) ______________________________ : NO
statsd (collect custom application metrics) ________________ : YES
timex (check system clock synchronization) _________________ : YES
idlejitter (check system latency and jitter) _______________ : YES
bash (support shell data collection jobs - charts.d) _______ : YES
debugfs (kernel debugging metrics) _________________________ : YES
cups (monitor printers and print jobs) _____________________ : NO
ebpf (monitor system calls) ________________________________ : NO
freeipmi (monitor enterprise server H/W) ___________________ : YES
nfacct (gather netfilter accounting) _______________________ : YES
perf (collect kernel performance events) ___________________ : YES
slabinfo (monitor kernel object caching) ___________________ : YES
Xen ________________________________________________________ : NO
Xen VBD Error Tracking _____________________________________ : NO
Exporters:
AWS Kinesis ________________________________________________ : NO
GCP PubSub _________________________________________________ : NO
MongoDB ____________________________________________________ : NO
Prometheus (OpenMetrics) Exporter __________________________ : YES
Prometheus Remote Write ____________________________________ : NO
Graphite ___________________________________________________ : YES
Graphite HTTP / HTTPS ______________________________________ : YES
JSON _______________________________________________________ : YES
JSON HTTP / HTTPS __________________________________________ : YES
OpenTSDB ___________________________________________________ : YES
OpenTSDB HTTP / HTTPS ______________________________________ : YES
All Metrics API ____________________________________________ : YES
Shell (use metrics in shell scripts) _______________________ : YES
Debug/Developer Features:
Trace All Netdata Allocations (with charts) ________________ : NO
Developer Mode (more runtime checks, slower) _______________ : NO
Additional info
There are two options for filtering prometheus metrics:
- The first is config file filtering, which, as discussed above, does not appear to work.
- The second is including the filter param directly in the URL, which does work from the CLI, for example:
# curl -s 'localhost:19999/api/v1/allmetrics?format=prometheus&filter=statsd_*' | wc -l
2
- However, when using automation to scrape this endpoint, various scrape clients, such as grafana-agent, automatically percent-encode pattern characters, such as `*` with `%2A`, in the URL, with no apparent way to escape this, and the scrape fails with this approach as well.
May 08 23:16:37 grafana-agent-start[261377]: ts=2024-05-08T23:16:37.608255206Z caller=scrape.go:1384 level=debug agent=prometheus component="scrape manager" target="http://localhost:8125/api/v1/allmetrics?filter=statsd_%2A&format=prometheus" msg="Scrape failed" err="Get \"http://localhost:8125/api/v1/allmetrics?filter=statsd_%2A&format=prometheus\": context deadline exceeded"
- Maybe it would be nice if the URL method also recognized the encoded form of the pattern matchers, so that clients which automatically encode special characters won't break.
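The suggestion above amounts to the server percent-decoding the `filter` query parameter before pattern matching. A minimal sketch of that behavior in Python (Netdata itself is written in C, so this is purely illustrative; `normalize_filter` is a hypothetical helper, not a Netdata function):

```python
from urllib.parse import unquote

def normalize_filter(raw: str) -> str:
    """Percent-decode a filter pattern so the encoded form sent by
    scrape clients such as grafana-agent (e.g. '%2A') matches the
    same charts as a literal '*'."""
    return unquote(raw)

# The encoded form a scrape client sends decodes to the literal pattern...
assert normalize_filter("statsd_%2A") == "statsd_*"
# ...while an already-literal pattern passes through unchanged.
assert normalize_filter("statsd_*") == "statsd_*"
```

Since `unquote` leaves unreserved characters untouched, decoding unconditionally would be backward compatible with clients that already send the literal pattern.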
Hi, @johnalotoski. I can't reproduce the problem in v1.45.4.
$ cat exporting.conf | grep -v "#"
[prometheus:exporter]
send charts matching = !*
$ curl -s 'localhost:19999/api/v1/allmetrics?format=prometheus' | wc -l
1
Have you restarted Netdata after updating exporting.conf?
Also, you can filter using the `filter` URL parameter:
$ curl 'localhost:19999/api/v1/allmetrics?format=prometheus&filter=*system.softirq*'
netdata_info{instance="pve-deb-work",application="netdata",version="v1.45.3-10-gb589731c8"} 1 1715237847219
netdata_system_softirq_latency_milliseconds_average{chart="system.softirq_latency",dimension="HI",family="softirqs"} 0.0000000 1715237830000
netdata_system_softirq_latency_milliseconds_average{chart="system.softirq_latency",dimension="TIMER",family="softirqs"} 0.2000000 1715237830000
netdata_system_softirq_latency_milliseconds_average{chart="system.softirq_latency",dimension="NET_TX",family="softirqs"} 0.0000000 1715237830000
netdata_system_softirq_latency_milliseconds_average{chart="system.softirq_latency",dimension="NET_RX",family="softirqs"} 0.0000000 1715237830000
netdata_system_softirq_latency_milliseconds_average{chart="system.softirq_latency",dimension="BLOCK",family="softirqs"} 0.0000000 1715237830000
netdata_system_softirq_latency_milliseconds_average{chart="system.softirq_latency",dimension="IRQ_POLL",family="softirqs"} 0.0000000 1715237830000
netdata_system_softirq_latency_milliseconds_average{chart="system.softirq_latency",dimension="TASKLET",family="softirqs"} 0.0000000 1715237830000
netdata_system_softirq_latency_milliseconds_average{chart="system.softirq_latency",dimension="SCHED",family="softirqs"} 0.4000001 1715237830000
netdata_system_softirq_latency_milliseconds_average{chart="system.softirq_latency",dimension="HRTIMER",family="softirqs"} 0.0000000 1715237830000
netdata_system_softirq_latency_milliseconds_average{chart="system.softirq_latency",dimension="RCU",family="softirqs"} 0.4000001 1715237830000
netdata_system_softirqs_softirqs_persec_average{chart="system.softirqs",dimension="HI",family="softirqs"} 0.0000000 1715237844000
netdata_system_softirqs_softirqs_persec_average{chart="system.softirqs",dimension="TIMER",family="softirqs"} 53.1620671 1715237844000
netdata_system_softirqs_softirqs_persec_average{chart="system.softirqs",dimension="NET_TX",family="softirqs"} 0.0000000 1715237844000
netdata_system_softirqs_softirqs_persec_average{chart="system.softirqs",dimension="NET_RX",family="softirqs"} 12.3700883 1715237844000
netdata_system_softirqs_softirqs_persec_average{chart="system.softirqs",dimension="BLOCK",family="softirqs"} 0.5258970 1715237844000
netdata_system_softirqs_softirqs_persec_average{chart="system.softirqs",dimension="TASKLET",family="softirqs"} 0.1428571 1715237844000
netdata_system_softirqs_softirqs_persec_average{chart="system.softirqs",dimension="SCHED",family="softirqs"} 69.8122657 1715237844000
netdata_system_softirqs_softirqs_persec_average{chart="system.softirqs",dimension="HRTIMER",family="softirqs"} 0.0000000 1715237844000
netdata_system_softirqs_softirqs_persec_average{chart="system.softirqs",dimension="RCU",family="softirqs"} 137.9602871 1715237844000
$
Hi @ilyam8, thanks for trying to reproduce; I do have this config issue resolved on my side now.
Regarding filtering by URL parameter, as in the example you provided: I addressed that above in the Additional info section -- see the log excerpt there. Some common scrape clients, such as grafana-agent, percent-encode pattern matchers such as `*` in the URL, so setting filter params through these clients doesn't work because of the automatic escaping. I wish it worked there too.
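As a possible workaround until scrape clients or Netdata handle the encoding, a hand-rolled fetcher that sends the request target verbatim avoids the automatic percent-encoding. A minimal sketch using Python's `http.client`, with the host and port taken from the curl examples above (the function names here are illustrative, not part of any client):

```python
import http.client

def build_path(pattern: str) -> str:
    # Build the request target verbatim -- no percent-encoding is applied,
    # so '*' in the pattern stays a literal '*'.
    return f"/api/v1/allmetrics?format=prometheus&filter={pattern}"

def fetch_allmetrics(host: str = "localhost", port: int = 19999,
                     pattern: str = "statsd_*") -> str:
    """Fetch filtered prometheus-format metrics from a local Netdata agent.

    http.client sends the path string exactly as given, unlike higher-level
    scrape clients (e.g. grafana-agent) that rewrite '*' as '%2A'.
    """
    conn = http.client.HTTPConnection(host, port, timeout=5)
    conn.request("GET", build_path(pattern))
    body = conn.getresponse().read().decode()
    conn.close()
    return body
```

This only helps for ad hoc scripting, of course; it does not fix the grafana-agent scrape path.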