grafana/k8s-monitoring-helm

Scrape metrics from old Prometheus annotations

Raboo opened this issue · 5 comments

Hi,

Sorry, this isn't fully a Helm chart issue; it's more of an Alloy config issue that I need help with. Perhaps I should ask in the Alloy project instead, but I'll try here first, and if this is the wrong place I can close this and open a ticket in the Alloy repo.

I'm trying to get Alloy to scrape targets that use the old Prometheus annotations that a lot of projects still include in their setups, i.e.

  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9091"
    prometheus.io/path: "/metrics"

I know that ServiceMonitors and PodMonitors are preferred, but these annotations are still used in so many Helm charts for "enabling monitoring/Prometheus".
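For context, these annotations come from the classic Prometheus Kubernetes example configuration, which looks roughly like this (abridged sketch; the job name is conventional):

```yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Only scrape pods annotated with prometheus.io/scrape=true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Override the metrics path from prometheus.io/path, if set
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Override the scrape port from prometheus.io/port, if set
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```

The goal here is to replicate that behavior with this chart's Alloy config.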

So I tried to get this Helm chart to configure Alloy to scrape those targets by writing some extraRelabelingRules, but that didn't help. Are my rules wrong, or do I need to do something additional to make this work?

Here are the relevant parts of my values:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ns.yaml

helmCharts:
  - name: k8s-monitoring
    repo: https://grafana.github.io/helm-charts
    version: 1.1.0
    releaseName: k8s-monitoring
    namespace: k8s-monitoring
    valuesInline:
      metrics:
        autoDiscover:
          extraRelabelingRules: |
            rule {
              source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape_slow"]
              regex = "true"
              action = "keep"
            }
            rule {
              source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_scrape_slow"]
              regex = "true"
              action = "keep"
            }
            rule {
              source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape"]
              regex = "true"
              action = "keep"
            }
            rule {
              source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_scrape"]
              regex = "true"
              action = "keep"
            }
            rule {
              source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_path"]
              action = "replace"
              target_label = "__metrics_path__"
            }
            rule {
              source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_path"]
              action = "replace"
              target_label = "__metrics_path__"
            }
            rule {
              source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_port"]
              regex = "(.+)"
              target_label = "__tmp_port"
            }
            rule {
              source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_port"]
              regex = "(.+)"
              target_label = "__tmp_port"
            }
            rule {
              source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_port", "__meta_kubernetes_pod_ip"]
              regex = "(\\d+);((([0-9]+?)(\\.|$)){4})"
              replacement = "$2:$1"
              target_label = "__address__"
            }
            rule {
              source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scheme"]
              action = "replace"
              target_label = "__scheme__"
            }
            rule {
              source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_scheme"]
              action = "replace"
              target_label = "__scheme__"
            }
................... snip
skl commented

Hmm... I'm not sure why that didn't work. Maybe you could try the following values instead? They take advantage of the existing annotation templates and values in the chart:

metrics:
  autoDiscover:
    extraRelabelingRules: |
      rule {
        source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape_slow"]
        regex = "true"
        action = "keep"
      }
      rule {
        source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_scrape_slow"]
        regex = "true"
        action = "keep"
      }
    annotations:
      scrape: "prometheus.io/scrape"
      metricsPath: "prometheus.io/path"
      metricsPortNumber: "prometheus.io/port"
      metricsScheme: "prometheus.io/scheme"

That should at least work for the prometheus.io/scrape case, though I'm not 100% convinced it will for the additional prometheus.io/scrape-slow discovery.

If it doesn't work, it could be that the initial keep rule in the template takes precedence (which would explain why your initial config didn't work); @petewall might know more about that.

If that's the case, we might need to add a new value to allow additional scrape annotations to be discovered.
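If that is how the template behaves, the failure mode is easy to see: relabel keep rules are applied in sequence and compose as AND, so appended rules can only narrow the target set, never widen it. A hypothetical sketch (the component name and the chart's built-in rule here are illustrative, not the actual generated config):

```
// Illustrative only; not the chart's actual generated config.
discovery.relabel "sketch" {
  targets = discovery.kubernetes.pods.targets

  // Suppose the chart's built-in rule runs first: it keeps only targets
  // carrying the k8s.grafana.com/scrape=true annotation and drops the rest.
  rule {
    source_labels = ["__meta_kubernetes_pod_annotation_k8s_grafana_com_scrape"]
    regex  = "true"
    action = "keep"
  }

  // extraRelabelingRules are appended after it, so a second keep rule only
  // filters the survivors; it can never re-admit targets the first rule
  // already dropped.
  rule {
    source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape"]
    regex  = "true"
    action = "keep"
  }
}
```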

Raboo commented

@skl I still want to be able to scrape pods and services with the k8s.grafana.com/** annotations.

If I put

metrics:
  autoDiscover:
    annotations:
      scrape: "prometheus.io/scrape"
      metricsPath: "prometheus.io/path"
      metricsPortNumber: "prometheus.io/port"
      metricsScheme: "prometheus.io/scheme"

Won't the k8s.grafana.com annotations stop working?

skl commented

@Raboo ah, ok. Yeah, that would stop the k8s.grafana.com annotation scraping; I didn't realise you wanted both. Ok, I'll wait for @petewall to weigh in.

petewall commented

If you want to keep both, you can use the extraConfig section to deploy raw config that scrapes the Prometheus annotations:

    discovery.relabel "prometheus_annotations" {
      targets = discovery.kubernetes.pods.targets
      rule {
        source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape"]
        regex = "(true)"
        action = "keep"
      }

      rule {
        source_labels = ["__address__", "__meta_kubernetes_pod_annotation_prometheus_io_port"]
        target_label = "__address__"
        regex = "([^:]+)(?::\\d+)?;(\\d+)"
        replacement = "$1:$2"
      }

      rule {
        source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_path"]
        target_label = "__metrics_path__"
        regex = "(.*)"
        replacement = "$1"
      }

      rule {
        source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape_interval"]
        target_label = "__scrape_interval__"
        regex = "(.*)"
        replacement = "$1"
      }

      rule {
        source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape_timeout"]
        target_label = "__scrape_timeout__"
        regex = "(.*)"
        replacement = "$1"
      }

      rule {
        source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_job"]
        target_label = "job"
      }
    }

    // Scrape the prometheus annotation workloads
    prometheus.scrape "pods" {
      targets    = discovery.relabel.prometheus_annotations.output
      forward_to = [prometheus.relabel.prometheus_annotations.receiver]
      clustering {
        enabled = true
      }
      scrape_interval = "60s"
    }

    prometheus.relabel "prometheus_annotations" {
      rule {
        source_labels = ["__name__"]
        regex = "^(metric_to_drop|another_metric_to_drop)$"
        action = "drop"
      }
      forward_to = [prometheus.relabel.metrics_service.receiver]
    }

Raboo commented

@petewall ok, thanks.

So this is what I ended up with. I don't think the Prometheus annotation convention defines scrape_interval or scrape_timeout, but I kept those rules anyway.

extraConfig: |
  // pods
  discovery.relabel "prometheus_annotations_pods" {
    targets = discovery.kubernetes.pods.targets
    rule {
      source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape"]
      regex = "(true)"
      action = "keep"
    }

    rule {
      source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_port", "__meta_kubernetes_pod_ip"]
      regex = "(\\d+);((([0-9]+?)(\\.|$)){4})"
      replacement = "$2:$1"
      target_label = "__address__"
    }

    rule {
      source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_path"]
      target_label = "__metrics_path__"
      regex = "(.*)"
      replacement = "$1"
    }

    rule {
      source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape_interval"]
      target_label = "__scrape_interval__"
      regex = "(.*)"
      replacement = "$1"
    }

    rule {
      source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape_timeout"]
      target_label = "__scrape_timeout__"
      regex = "(.*)"
      replacement = "$1"
    }

    rule {
      source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_job"]
      target_label = "job"
    }

    rule {
      source_labels = ["__meta_kubernetes_pod_container_port_name"]
      target_label = "__tmp_port"
    }

    rule {
      source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scheme"]
      action = "replace"
      target_label = "__scheme__"
    }
  }

  // services
  discovery.relabel "prometheus_annotations_services" {
    targets = discovery.kubernetes.services.targets
    rule {
      source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_scrape"]
      regex = "(true)"
      action = "keep"
    }

    rule {
      source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_path"]
      target_label = "__metrics_path__"
      regex = "(.*)"
      replacement = "$1"
    }

    rule {
      source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_scrape_interval"]
      target_label = "__scrape_interval__"
      regex = "(.*)"
      replacement = "$1"
    }

    rule {
      source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_scrape_timeout"]
      target_label = "__scrape_timeout__"
      regex = "(.*)"
      replacement = "$1"
    }

    rule {
      source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_job"]
      target_label = "job"
    }

    // Choose the service port
    rule {
      source_labels = ["__meta_kubernetes_service_port_name"]
      target_label = "__tmp_port"
    }

    rule {
      source_labels = ["__meta_kubernetes_service_port_number"]
      target_label = "__tmp_port"
    }
    rule {
      source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_port"]
      regex = "(.+)"
      target_label = "__tmp_port"
    }
    rule {
      source_labels = ["__meta_kubernetes_service_port_number"]
      action = "keepequal"
      target_label = "__tmp_port"
    }

    rule {
      source_labels = ["__meta_kubernetes_service_annotation_prometheus_io_scheme"]
      action = "replace"
      target_label = "__scheme__"
    }
  }



  discovery.relabel "prometheus_annotations_http" {
    targets = concat(discovery.relabel.prometheus_annotations_pods.output, discovery.relabel.prometheus_annotations_services.output)
    rule {
      source_labels = ["__scheme__"]
      regex = "https"
      action = "drop"
    }
  }

  discovery.relabel "prometheus_annotations_https" {
    targets = concat(discovery.relabel.prometheus_annotations_pods.output, discovery.relabel.prometheus_annotations_services.output)
    rule {
      source_labels = ["__scheme__"]
      regex = "https"
      action = "keep"
    }
  }

  prometheus.scrape "prometheus_annotations_http" {
    targets = discovery.relabel.prometheus_annotations_http.output
    honor_labels = true
    clustering {
      enabled = true
    }
    forward_to = [prometheus.relabel.prometheus_annotations.receiver]
  }

  prometheus.scrape "prometheus_annotations_https" {
    targets = discovery.relabel.prometheus_annotations_https.output
    honor_labels = true
    bearer_token_file = "/var/run/secrets/kubernetes.io/serviceaccount/token"
    tls_config {
      insecure_skip_verify = true
    }
    clustering {
      enabled = true
    }
    forward_to = [prometheus.relabel.prometheus_annotations.receiver]
  }

  prometheus.relabel "prometheus_annotations" {
    max_cache_size = 100000
    forward_to = [prometheus.relabel.metrics_service.receiver]
  }