influxdata/helm-charts

telegraf-ds helm chart doesn't work with k3s clusters - no docker.sock

ferdinandosimonetti opened this issue · 2 comments

Hello, I've tried to install Telegraf as a DaemonSet on my (Hetzner-backed) K3S cluster with this values.yml:

USER-SUPPLIED VALUES:
config.docker_endpoint: ""
config.outputs:
- influxdb_v2:
    bucket: default
    organization: influxdata
    token: BLABLABLA==
    urls:
    - http://influx2.monitoring.svc

Kubernetes version/platform

[test0|monitoring] ferdi@DESKTOP-NL6I2OD:~/another-test$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:26:19Z", GoVersion:"go1.18.2", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.9+k3s1", GitCommit:"8b0b50a5e88214f69e4ef35ebfd60c6adac9735f", GitTreeState:"clean", BuildDate:"2022-04-28T22:46:07Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.24) and server (1.22) exceeds the supported minor version skew of +/-1

but my Telegraf pods are still stuck in ContainerCreating

[test0|monitoring] ferdi@DESKTOP-NL6I2OD:~/another-test$ kubectl get po
NAME                            READY   STATUS              RESTARTS   AGE
ds-telegraf-telegraf-ds-wr4sx   0/1     ContainerCreating   0          96m
ds-telegraf-telegraf-ds-xgbd8   0/1     ContainerCreating   0          96m
ds-telegraf-telegraf-ds-zwhd4   0/1     ContainerCreating   0          96m
influx2-influxdb2-0             1/1     Running             0          111m

With these error messages:

  Warning  FailedMount  56m (x4 over 88m)    kubelet  Unable to attach or mount volumes: unmounted volumes=[docker-socket], unattached volumes=[kube-api-access-rqmkt varrunutmpro hostfsro docker-socket config]: timed out waiting for the condition
  Warning  FailedMount  11m (x50 over 97m)   kubelet  MountVolume.SetUp failed for volume "docker-socket" : hostPath type check failed: /var/run/docker.sock is not a socket file
  Warning  FailedMount  6m35s (x4 over 95m)  kubelet  Unable to attach or mount volumes: unmounted volumes=[docker-socket], unattached volumes=[config kube-api-access-rqmkt varrunutmpro hostfsro docker-socket]: timed out waiting for the condition
  Warning  FailedMount  2m2s (x17 over 76m)  kubelet  Unable to attach or mount volumes: unmounted volumes=[docker-socket], unattached volumes=[varrunutmpro hostfsro docker-socket config kube-api-access-rqmkt]: timed out waiting for the condition

_Originally posted by @ferdinandosimonetti in https://github.com/influxdata/helm-charts/issues/452#issuecomment-1251421912_

I've edited the DaemonSet in place, removing all references to /var/run/docker.sock, and edited the ConfigMap containing telegraf.conf accordingly (it hadn't picked up my influxdb_v2 settings).
Also, I've enabled the kube_inventory plugin for better K8S observability, and added the K8S template from here
Now it works as expected.
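
For anyone repeating the in-place edit, the entries to delete probably look something like the sketch below. It is reconstructed from the kubelet events (volume name `docker-socket`, the "is not a socket file" type check), not copied from the chart; the container name and the mount path are assumptions.

```yaml
# Hypothetical excerpt of the DaemonSet pod spec, reconstructed from the
# kubelet events in this issue. Not copied from the telegraf-ds chart.
spec:
  template:
    spec:
      containers:
        - name: telegraf                        # assumed container name
          volumeMounts:
            - name: docker-socket               # delete this mount
              mountPath: /var/run/docker.sock   # assumed mount path
      volumes:
        - name: docker-socket                   # delete this volume
          hostPath:
            path: /var/run/docker.sock
            type: Socket                        # inferred from "hostPath type check failed"
```

K3s runs containerd by default, so the host has no Docker socket at that path and a hostPath volume that requires a socket file can never be mounted.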

apiVersion: v1
data:
  telegraf.conf: |2

    [agent]
      collection_jitter = "0s"
      debug = false
      flush_interval = "10s"
      flush_jitter = "0s"
      hostname = "$HOSTNAME"
      interval = "10s"
      logfile = ""
      metric_batch_size = 1000
      metric_buffer_limit = 10000
      omit_hostname = false
      precision = ""
      quiet = false
      round_interval = true

    [[outputs.influxdb_v2]]
      token = "BLAHBLAHBLAH=="
      organization = "influxdata"
      bucket = "default"
      urls = [
        "http://influx2-influxdb2.monitoring.svc:80"
      ]

    [[inputs.diskio]]
    [[inputs.kernel]]
    [[inputs.mem]]
    [[inputs.net]]
    [[inputs.processes]]
    [[inputs.swap]]
    [[inputs.system]]

    [[inputs.cpu]]
    percpu = true
    totalcpu = true
    collect_cpu_time = false
    report_active = false

    [[inputs.disk]]
    ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

    [[inputs.kubernetes]]
    url = "https://$HOSTIP:10250"
    bearer_token = "/var/run/secrets/kubernetes.io/serviceaccount/token"
    tls_ca = "/run/secrets/kubernetes.io/serviceaccount/ca.crt"
    #insecure_skip_verify = true

    [[inputs.kube_inventory]]
    ## URL for the Kubernetes API
    url = "https://${KUBERNETES_SERVICE_HOST}"
    namespace = ""
    bearer_token = "/run/secrets/kubernetes.io/serviceaccount/token"
    tls_ca = "/run/secrets/kubernetes.io/serviceaccount/ca.crt"
    #insecure_skip_verify = true
kind: ConfigMap
...

The supplied user values do not seem to be correct (or are not processed correctly). If you use

config:
  outputs:
  - influxdb_v2:
      bucket: default
      organization: influxdata
      token: BLABLABLA==
      urls:
      - http://influx2.monitoring.svc
  docker_endpoint: ""

you will get the expected config output, without the [[inputs.docker]] entry.
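
A likely explanation, offered as an interpretation rather than something confirmed in this thread: a values file is plain YAML, so `config.docker_endpoint` is read as one top-level key whose name contains a dot, not as `docker_endpoint` nested under `config`. Dotted paths are only expanded by `--set` on the command line. The fragment below puts the two spellings side by side.

```yaml
# Spelling from the original values.yml: a single top-level key named
# "config.docker_endpoint". A template reading .Values.config.docker_endpoint
# never sees it, so the chart default stays in effect.
config.docker_endpoint: ""

# Nested spelling: this is the path .Values.config.docker_endpoint resolves to.
config:
  docker_endpoint: ""
```

The same applies to `config.outputs`, which would explain why the influxdb_v2 settings did not reach the generated telegraf.conf.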