prometheus/blackbox_exporter

Blackbox ICMP probe fails for servers that respond to ping

Gopinath-31893 opened this issue · 2 comments

Host operating system: output of uname -a

We have deployed it in an AKS cluster. Blackbox exporter is running in a container.

Linux Ubuntu SMP x86_64 GNU/Linux

blackbox_exporter version: output of blackbox_exporter --version

We deployed it through the Helm chart.

Chart version: prometheus-blackbox-exporter-8.6.1
Blackbox version: v0.24.0

What is the blackbox.yml module config.

apiVersion: v1
data:
  blackbox.yaml: |
    modules:
      icmp:
        icmp:
          preferred_ip_protocol: ip4
        prober: icmp
        timeout: 90s

What is the prometheus.yml scrape config.

  - job_name: 'job1'
    scrape_interval: 5m
    scrape_timeout: 1m
    static_configs:
      - targets: ['server1.com']
    metrics_path: /probe
    params:
      module: [icmp]

We have also added relabel_configs that point to the Blackbox exporter URL for collecting metrics.
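
The exact relabeling rules are not reproduced here. As a minimal sketch of the usual Blackbox exporter pattern, they would look roughly like the fragment below, placed at the same level as `static_configs` in the job. The address `blackbox-exporter:9115` is a placeholder assumed for illustration, not the real service name in our cluster.

```yaml
# Sketch only: the replacement address is a hypothetical placeholder.
relabel_configs:
  # Pass the original target (e.g. server1.com) as the ?target= URL parameter.
  - source_labels: [__address__]
    target_label: __param_target
  # Keep the probed host as the instance label on the resulting metrics.
  - source_labels: [__param_target]
    target_label: instance
  # Send the actual scrape request to the Blackbox exporter instead of the target.
  - target_label: __address__
    replacement: blackbox-exporter:9115
```

With rules of this shape, Prometheus scrapes `/probe?module=icmp&target=server1.com` on the exporter rather than contacting `server1.com` directly.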

What logging output did you get from adding &debug=true to the probe URL?

ts=2024-03-14T15:27:41.754759714Z caller=main.go:181 module=icmp target=server1.com level=info msg="Beginning probe" probe=icmp timeout_seconds=90
ts=2024-03-14T15:27:41.754824514Z caller=icmp.go:91 module=icmp target=server1.com level=info msg="Resolving target address" target=server1.com ip_protocol=ip4
ts=2024-03-14T15:27:41.83982415Z caller=icmp.go:91 module=icmp target=server1.com level=info msg="Resolved target address" target=server1.com ip=88.88.88.88
ts=2024-03-14T15:27:41.83987175Z caller=handler.go:120 module=icmp target=server1.com level=info msg="Creating socket"
ts=2024-03-14T15:27:41.83989815Z caller=handler.go:120 module=icmp target=server1.com level=debug msg="Unable to do unprivileged listen on socket, will attempt privileged" err="socket: permission denied"
ts=2024-03-14T15:27:41.839953051Z caller=handler.go:120 module=icmp target=server1.com level=info msg="Creating ICMP packet" seq=30962 id=45667
ts=2024-03-14T15:27:41.839975051Z caller=handler.go:120 module=icmp target=server1.com level=info msg="Writing out packet"
ts=2024-03-14T15:27:41.839985551Z caller=handler.go:120 module=icmp target=server1.com level=debug msg="Setting TTL (IPv4 unprivileged)" ttl=64
ts=2024-03-14T15:27:41.840079951Z caller=handler.go:120 module=icmp target=server1.com level=info msg="Waiting for reply packets"
ts=2024-03-14T15:29:11.755637286Z caller=handler.go:120 module=icmp target=server1.com level=debug msg="Cannot get TTL from the received packet. 'probe_icmp_reply_hop_limit' will be missing."
ts=2024-03-14T15:29:11.755683086Z caller=handler.go:120 module=icmp target=server1.com level=warn msg="Timeout reading from socket" err="read ip 0.0.0.0: raw-read ip4 0.0.0.0: i/o timeout"
ts=2024-03-14T15:29:11.755729187Z caller=main.go:181 module=icmp target=server1.com level=error msg="Probe failed" duration_seconds=90.000940273

What did you do that produced an error?

The error above is just one example from our environment. We scrape 500+ servers with Blackbox exporter, and the issue occurs on random servers: even though a server is up and responds to ping, Blackbox throws an error like this for many of them.
Initially we used the default timeout. Because we kept getting this kind of error, we raised the timeout to 90s in the Blackbox configuration.

What did you expect to see?

The configuration looks fine, so why does Blackbox throw this error?
A manual ping responds fine, even from inside the container.
We would like to know which configuration we need to change to resolve this.

What did you see instead?