Fluentd logs are full of backslashes and Kibana doesn't show k8s pod logs

Question

avarf opened this issue 5 years ago · 17 comments

Describe the bug
I set up an EFK stack to gather the logs of my k8s pods, based on this tutorial: https://mherman.org/blog/logging-in-kubernetes-with-elasticsearch-Kibana-fluentd/ on a MicroK8s single-node cluster. Everything is up and running: I can connect Kibana to Elasticsearch and see the indices, but the Discover section of Kibana shows no logs from my pods, although there are kubelet logs.

When I checked the logs of fluentd I saw that it is full of backslashes:

2019-08-05 15:23:17 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-08-05T17:23:10.167379794+02:00 stdout P 2019-08-05 15:23:10 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-08-05T17:23:07.09726655+02:00 stdout P 2019-08-05 15:23:07 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\"2019-08-05T17:23:04.433817307+02:00 stdout P 2019-08-05 15:23:04 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\\\\\"2019-08-05T17:22:52.546188522+02:00 stdout P 2019-08-05 15:22:52 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\\\\\\\\\\\\\"2019-08-05T17:22:46.694679863+02:00 stdout F \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

There are many more backslashes, but I copied only this much to show what the log looks like.

Your Environment

Fluentd or td-agent version: I tested this with two images: fluent/fluentd-kubernetes-daemonset:v1.4-debian-elasticsearch and also v1.3 but the results were the same
Operating system: Ubuntu 18.04, with Fluentd running in a container on a single-node Kubernetes cluster (MicroK8s)

Your Configuration
Based on the tutorial that I mentioned earlier I am using two config files for setting up fluentd:

fluentd-rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: fluentd
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system

fluentd-daemonset.yaml

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  # namespace: default
  labels:
    k8s-app: fluentd-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4-debian-elasticsearch
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch.logging"
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENT_UID
            value: "0"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

Answer 1 · 2019-08-07T17:36:00.000Z

When I checked the logs of fluentd I saw that it is full of backslashes:

Is this log a single line? If so, it seems several logs are merged into one.
Could you share a configuration/application example that reproduces the problem?

Answer 2 · 2019-08-08T07:48:22.000Z

No. The log contains single lines of actual log output followed by pages of backslashes. I didn't want to copy all the meaningless backslashes, and when I searched for "error" there were no matches.
Regarding the configuration, you already have all of it: I followed that tutorial and used the image and environment variables that you can see in the YAML files, running on MicroK8s and Ubuntu 18.04.

Answer 3 · 2019-11-17T23:49:19.000Z

Any progress on this issue? I seem to have just hit exactly the same problem.

I use a slightly different setup using

image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch

but otherwise substantially the same.

Looking at the logs, it appears to be repeatedly reprocessing the same information, objecting to the format, which generates a new, longer log entry which is then reprocessed .... and around we go.

Answer 4 · 2019-12-09T20:28:24.000Z

I have the same problem after following this tutorial, but using k3s as my kubernetes deployment.

If I strip the backslashes I can see something like:

# kubectl logs --tail=5 fluentd-48jkv -n kube-logging |tr -s "\\"
tr: warning: an unescaped backslash at end of string is not portable
\"
2019-12-09 20:23:29 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-12-09T20:23:24.66350503Z stdout F \"\"\""
2019-12-09 20:23:29 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-12-09T20:23:24.664147887Z stdout P 2019-12-09 20:23:24 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:21.243596958Z stdout P 2019-12-09 20:23:21 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:07.807619666Z stdout P 2019-12-09 20:23:07 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:01.152628152Z stdout F \"

But otherwise it's not even possible to see what is going on:

# kubectl logs --tail=5 fluentd-48jkv -n kube-logging |egrep -o '\\'|wc -l
32650

My fluentd.yaml is as follows:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-logging
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch.kube-logging.svc.cluster.local"
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENTD_SYSTEMD_CONF
            value: disable
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

Answer 5 · 2020-01-13T10:00:48.000Z

Same issue. Does anyone have a solution for this?

Answer 6 · 2020-01-30T08:24:44.000Z

Same issue \\\\

Answer 7 · 2020-02-23T23:48:28.000Z

If your fluentd logs are growing in backslashes, then your fluentd container is parsing its own logs and recursively generating new logs.
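
The growth can be reproduced outside Fluentd. The following is a minimal Python sketch, not Fluentd's actual code: it assumes that each "pattern not match" warning embeds the rejected line as a quoted, escaped string, and it uses json.dumps as a stand-in for that quoting. The function name simulate_feedback is made up for this illustration.

```python
import json
from typing import List


def simulate_feedback(first_line: str, rounds: int) -> List[int]:
    """Return the number of backslashes in the warning emitted in each round."""
    counts = []
    line = first_line
    for _ in range(rounds):
        # The warning quotes the line it could not parse. json.dumps is an
        # assumed stand-in for that quoting: it escapes " as \" and \ as \\.
        line = "pattern not match: " + json.dumps(line)
        counts.append(line.count("\\"))
        # The warning is written to the container log, tailed again,
        # rejected again, and becomes the input of the next round.
    return counts


print(simulate_feedback("hello", 5))
```

With these assumptions the script prints [0, 2, 8, 22, 52]: every round doubles the existing backslashes and adds one more per quote, which is why a few seconds of looping yields pages of backslashes.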

Consider creating a fluentd-config.yaml file that is set up to ignore /var/log/containers/fluentd* logs. My example here will help you parse Apache logs... RTFM for more information on configuring sources.

Here is my fluentd-config.yaml file:

kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: kube-logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  containers.input.conf: |-
    <source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*.log
      exclude_path ["/var/log/containers/fluentd*"]
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      format /^.* (?<source>(stderr|stdout))\ F\ (?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
      time_format %d/%b/%Y:%H:%M:%S %z
    </source>
  output.conf: |-
    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
    <match **>
       @type elasticsearch
       log_level info
       include_tag_key true
       host elasticsearch.kube-logging.svc.cluster.local
       port 9200
       logstash_format true
       # Set the chunk limits.
       buffer_chunk_limit 2M
       buffer_queue_limit 8
       flush_interval 5s
       # Never wait longer than 30 seconds between retries.
       max_retry_wait 30
       # Disable the limit on the number of retries (retry forever).
       disable_retry_limit
       # Use multiple threads for processing.
       num_threads 2
    </match>

Then you will want to update your fluentd DaemonSet. I have had success with the gcr.io/google-containers/fluentd-elasticsearch:v2.0.1 image. Attach your fluentd-config to your fluentd DaemonSet.

Here's what that looks like:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: gcr.io/google-containers/fluentd-elasticsearch:v2.0.1
        env:
          - name: FLUENTD_SYSTEMD_CONF
            value: "disable"
          - name: FLUENTD_ARGS
            value: "--no-supervisor -q"
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlogcontainers
          mountPath: /var/log/containers
          readOnly: true
        - name: config
          mountPath: /etc/fluent/config.d
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlogcontainers
        hostPath:
          path: /var/log/containers/
      - name: config
        configMap:
          name: fluentd-config

Best of luck!

Answer 8 · 2020-03-12T08:05:43.000Z

I just added a new environment variable to fluentd-kubernetes-daemonset for this case:

https://github.com/fluent/fluentd-kubernetes-daemonset#use-fluent_container_tail_exclude_path-to-exclude-specific-container-logs

Answer 9 · 2020-10-20T14:40:03.000Z

I see 2 possible concurrent causes:

1. You're not excluding Fluentd's own logs (hence the numerous '\' characters and the circular log messages).
2. k3s prefixes each log line with a datetime, the stream (stdout or stderr) and a log tag. So, if your message is "hello Dolly", k3s will save it to the file as:
   2020-10-20T18:05:39.163671864-05:00 stdout F "hello Dolly"

The "pattern not match" warnings explain why Kibana doesn't see any of those messages: they're not being sent to your Elasticsearch service.

Having a proper filter/parser would help on this.
Can you post your fluentd conf?
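
To make the format mismatch concrete, here is a small Python sketch. It assumes the tail source is configured with a JSON parser, as it would be for Docker's json-file log driver. The docker_line sample is invented for illustration, and cri_line is the "hello Dolly" line from above.

```python
import json

# Docker's json-file driver writes one JSON object per line (sample made up here).
docker_line = '{"log":"hello Dolly\\n","stream":"stdout","time":"2020-10-20T23:05:39.163671864Z"}'

# containerd/k3s write plain text: timestamp, stream, log tag, message.
cri_line = '2020-10-20T18:05:39.163671864-05:00 stdout F "hello Dolly"'


def parses_as_json(line: str) -> bool:
    """Return True if the whole line is a valid JSON document."""
    try:
        json.loads(line)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_json(docker_line))  # the Docker-style line is valid JSON
print(parses_as_json(cri_line))     # the CRI-style line is not
```

A parser that expects the first shape rejects every line of the second shape, which matches the "pattern not match" warnings reported in this thread.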

Answer 10 · 2020-11-11T19:46:26.000Z

Is there a good way to still ship Fluentd's own logs?

Answer 11 · 2020-12-17T14:57:20.000Z

I got this issue as well, because I was using containerd instead of docker. I solved it by putting in the following configuration:

- name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
  value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
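
To see which fields this regex extracts, it can be tried in Python against the sample line quoted earlier in this thread. This is only a sketch: Python spells named groups as (?P<name>...) instead of (?<name>...), and the helper name parse_cri_line is made up for this example.

```python
import re

# Python spelling of the regex above: (?<name>...) becomes (?P<name>...).
CRI_LINE = re.compile(r"^(?P<time>.+) (?P<stream>stdout|stderr) [^ ]* (?P<log>.*)$")


def parse_cri_line(line):
    """Return the named fields of a CRI-style log line, or None if it does not match."""
    match = CRI_LINE.match(line)
    return match.groupdict() if match else None


sample = '2020-10-20T18:05:39.163671864-05:00 stdout F "hello Dolly"'
print(parse_cri_line(sample))
```

The [^ ]* part swallows the log tag (F or P) without capturing it, so only time, stream and log end up in the record.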

Answer 12 · 2021-04-11T15:32:57.000Z

@micktg
Your solution fixed my problem! Much appreciation!

Answer 13 · 2021-04-12T19:09:27.000Z

For the latest images, using the cri parser is better than a regexp: https://github.com/fluent/fluentd-kubernetes-daemonset#use-cri-parser-for-containerdcri-o-logs
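
One reason a plain regexp falls short: in CRI logs the tag after the stream name is P for a partial line and F for the final (or only) piece of a message, so long messages arrive split across several lines. The Python sketch below shows the reassembly step under that assumption. It is not the plugin's code, and join_partial_lines is a name invented here.

```python
def join_partial_lines(entries):
    """Yield complete messages from (logtag, text) pairs in file order."""
    buffer = []
    for logtag, text in entries:
        buffer.append(text)
        if logtag == "F":
            # F closes the current message; P means more pieces follow.
            yield "".join(buffer)
            buffer = []


pieces = [("P", "abc"), ("P", "def"), ("F", "ghi"), ("F", "solo")]
print(list(join_partial_lines(pieces)))
```

Pieces tagged P are held back until the next F arrives, so the three-part message comes out as one string and the single F line passes through unchanged.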

Answer 14 · 2021-08-06T07:43:45.000Z

I followed a DigitalOcean tutorial https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes to set up my EFK stack for Kubernetes and faced the same issue. The above answer by @micktg resolved the issue. I added the entries below to the environment variables of my fluentd YAML file, so now my environment variables look like this:

    env:
      - name:  FLUENT_ELASTICSEARCH_HOST
        value: "elasticsearch.kube-logging.svc.cluster.local"
      - name:  FLUENT_ELASTICSEARCH_PORT
        value: "9200"
      - name: FLUENT_ELASTICSEARCH_SCHEME
        value: "http"
      - name: FLUENTD_SYSTEMD_CONF
        value: disable
      - name: FLUENT_CONTAINER_TAIL_EXCLUDE_PATH
        value: /var/log/containers/fluent*
      - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
        value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
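
As a quick check of what the exclude pattern covers, the glob can be tried against file names in Python. This is a sketch only: Fluentd expands the pattern itself, fnmatchcase is used here as an approximation, and the two file names are hypothetical examples of the pod_namespace_container-id.log naming under /var/log/containers.

```python
from fnmatch import fnmatchcase

EXCLUDE = "/var/log/containers/fluent*"

# Hypothetical file names, for illustration only.
log_files = [
    "/var/log/containers/fluentd-48jkv_kube-logging_fluentd-0123abcd.log",
    "/var/log/containers/my-app-5d9c7_default_my-app-4567ef01.log",
]

# Keep only the files that the exclude pattern does not cover.
tailed = [path for path in log_files if not fnmatchcase(path, EXCLUDE)]
print(tailed)
```

Only the application's log file remains in the list, so Fluentd's own output no longer feeds back into its input. Any other pod whose name starts with "fluent" would be excluded as well.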

Answer 15 · 2021-08-13T00:05:58.000Z

I found that the answers from @micktg and @varungupta19 solve the problem.

Answer 16 · 2021-10-12T15:28:39.000Z

Thanks, @micktg and @varungupta19. Problem solved.

Answer 17 · 2024-04-06T09:57:25.000Z

I followed a DigitalOcean tutorial https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes to set up my EFK stack for Kubernetes and faced the same issue. The above answer by @micktg resolved the issue. I added the entries below to the environment variables of my fluentd YAML file, so now my environment variables look like this:
    env:
      - name:  FLUENT_ELASTICSEARCH_HOST
        value: "elasticsearch.kube-logging.svc.cluster.local"
      - name:  FLUENT_ELASTICSEARCH_PORT
        value: "9200"
      - name: FLUENT_ELASTICSEARCH_SCHEME
        value: "http"
      - name: FLUENTD_SYSTEMD_CONF
        value: disable
      - name: FLUENT_CONTAINER_TAIL_EXCLUDE_PATH
        value: /var/log/containers/fluent*
      - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
        value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/ 

Adding value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/ didn't help me. I am trying to set this up on MicroK8s.