Fluentd logs is full of backslash and kibana doesn't show k8s pods logs
avarf opened this issue ยท 17 comments
Describe the bug
I set up an EFK stack for gathering my different k8s pods logs based on this tutorial: https://mherman.org/blog/logging-in-kubernetes-with-elasticsearch-Kibana-fluentd/ on a Microk8s single node cluster. Everything is up and working and I can connect kibanna to elasticsearch and see the indexes but in the discovery section of kibana there is no log related to my pods and there are kubelete logs.
When I checked the logs of fluentd I saw that it is full of backslashes:
2019-08-05 15:23:17 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-08-05T17:23:10.167379794+02:00 stdout P 2019-08-05 15:23:10 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-08-05T17:23:07.09726655+02:00 stdout P 2019-08-05 15:23:07 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\"2019-08-05T17:23:04.433817307+02:00 stdout P 2019-08-05 15:23:04 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\\\\\"2019-08-05T17:22:52.546188522+02:00 stdout P 2019-08-05 15:22:52 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\\\\\\\\\\\\\"2019-08-05T17:22:46.694679863+02:00 stdout F \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
There are much more backslashes but I just copied this amount to show the log.
Your Environment
- Fluentd or td-agent version: I tested this with two images:
fluent/fluentd-kubernetes-daemonset:v1.4-debian-elasticsearch
and alsov1.3
but the results were the same - Operating system: I am using Ubuntu 18.04 but the fluentd is running in a container and in a single node kubernetes cluster running on Microk8s
Your Configuration
Based on the tutorial that I mentioned earlier I am using two config files for setting up fluentd:
- fluentd-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluentd
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: fluentd
namespace: kube-system
rules:
- apiGroups:
- ""
resources:
- pods
- namespaces
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: fluentd
roleRef:
kind: ClusterRole
name: fluentd
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: fluentd
namespace: kube-system
- fluentd-daemonset.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: fluentd
namespace: kube-system
# namespace: default
labels:
k8s-app: fluentd-logging
version: v1
kubernetes.io/cluster-service: "true"
spec:
template:
metadata:
labels:
k8s-app: fluentd-logging
version: v1
kubernetes.io/cluster-service: "true"
spec:
serviceAccount: fluentd
serviceAccountName: fluentd
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1.4-debian-elasticsearch
env:
- name: FLUENT_ELASTICSEARCH_HOST
value: "elasticsearch.logging"
- name: FLUENT_ELASTICSEARCH_PORT
value: "9200"
- name: FLUENT_ELASTICSEARCH_SCHEME
value: "http"
- name: FLUENT_UID
value: "0"
resources:
limits:
memory: 200Mi
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
When I checked the logs of fluentd I saw that it is full of backslashes:
Is this log single line, right? If so, it seems several logs are merged into one.
Do you show the configuration/application example to reproduce the problem?
No, the log is full of backslashes and there are single lines of actual log and then pages of backslashes but I didn't want to copy all the meaningless backslashes and when I searched for the "error" there wasn't any.
Regarding the configuration, you have all the configuration, I followed that tutorial, used an image and the environment variables that you can see in the yaml files and I ran it on Microk8s and Ubuntu 18.04.
Any progress on this issue ? I seem to have just hit exactly the same problem
I use a slightly different setup using
image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
but otherwise substantially the same.
Looking at the logs, it appears to be repeatedly reprocessing the same information, objecting to the format, which generates a new, longer log entry which is then reprocessed .... and around we go.
I have the same problem after following this tutorial, but using k3s as my kubernetes deployment.
If I strip the backslashs I can see something like:
# kubectl logs --tail=5 fluentd-48jkv -n kube-logging |tr -s "\\"
tr: warning: an unescaped backslash at end of string is not portable
\"
2019-12-09 20:23:29 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-12-09T20:23:24.66350503Z stdout F \"\"\""
2019-12-09 20:23:29 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-12-09T20:23:24.664147887Z stdout P 2019-12-09 20:23:24 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:21.243596958Z stdout P 2019-12-09 20:23:21 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:07.807619666Z stdout P 2019-12-09 20:23:07 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:01.152628152Z stdout F \"
But otherwise it's not even possible to see what is going on:
# kubectl logs --tail=5 fluentd-48jkv -n kube-logging |egrep -o '\\'|wc -l
32650
My fluend.yaml is as follows:
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluentd
namespace: kube-logging
labels:
app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: fluentd
labels:
app: fluentd
rules:
- apiGroups:
- ""
resources:
- pods
- namespaces
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: fluentd
roleRef:
kind: ClusterRole
name: fluentd
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: fluentd
namespace: kube-logging
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd
namespace: kube-logging
labels:
app: fluentd
spec:
selector:
matchLabels:
app: fluentd
template:
metadata:
labels:
app: fluentd
spec:
serviceAccount: fluentd
serviceAccountName: fluentd
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
env:
- name: FLUENT_ELASTICSEARCH_HOST
value: "elasticsearch.kube-logging.svc.cluster.local"
- name: FLUENT_ELASTICSEARCH_PORT
value: "9200"
- name: FLUENT_ELASTICSEARCH_SCHEME
value: "http"
- name: FLUENTD_SYSTEMD_CONF
value: disable
resources:
limits:
memory: 512Mi
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
Same issue. Does anyone have a solution for this?
Same issue \\\\
If your fluentd logs are growing in backslashes, then your fluentd container is parsing its own logs and recursively generating new logs.
Consider creating a fluentd-config.yaml file that is setup to ignore /var/log/containers/fluentd*
logs. My example here will help you parse Apache logs... RTFM for more information on configuring sources.
Here is my fluentd-config.yaml
file:
kind: ConfigMap
apiVersion: v1
metadata:
name: fluentd-config
namespace: kube-logging
labels:
addonmanager.kubernetes.io/mode: Reconcile
data:
containers.input.conf: |-
<source>
@type tail
@id in_tail_container_logs
path /var/log/containers/*.log
exclude_path ["/var/log/containers/fluentd*"]
pos_file /var/log/fluentd-containers.log.pos
tag kubernetes.*
read_from_head true
format /^.* (?<source>(stderr|stdout))\ F\ (?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
time_format %d/%b/%Y:%H:%M:%S %z
</source>
output.conf: |-
# Enriches records with Kubernetes metadata
<filter kubernetes.**>
type kubernetes_metadata
</filter>
<match **>
type elasticsearch
log_level info
include_tag_key true
host elasticsearch.kube-logging.svc.cluster.local
port 9200
logstash_format true
# Set the chunk limits.
buffer_chunk_limit 2M
buffer_queue_limit 8
flush_interval 5s
# Never wait longer than 5 minutes between retries.
max_retry_wait 30
# Disable the limit on the number of retries (retry forever).
disable_retry_limit
# Use multiple threads for processing.
num_threads 2
</match>
Then you will want to update your fluentd DaemonSet. I have had success with the gcr.io/google-containers/fluentd-elasticsearch:v2.0.1
image. Attach your fluentd-config to your fluentd DaemonSet.
Here's what that looks like:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd
namespace: kube-logging
labels:
app: fluentd
spec:
selector:
matchLabels:
app: fluentd
template:
metadata:
labels:
app: fluentd
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
serviceAccount: fluentd
serviceAccountName: fluentd
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
- name: fluentd
image: gcr.io/google-containers/fluentd-elasticsearch:v2.0.1
env:
- name: FLUENTD_SYSTEMD_CONF
value: "disable"
- name: FLUENTD_ARGS
value: "--no-supervisor -q"
resources:
limits:
memory: 512Mi
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlogcontainers
mountPath: /var/log/containers
readOnly: true
- name: config
mountPath: /etc/fluent/config.d
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlogcontainers
hostPath:
path: /var/log/containers/
- name: config
configMap:
name: fluentd-config
Best of luck!
Just added new envvar to fluentd-kubernetes-daemonset for this case:
I see 2 possible concurrent causes:
- You're not excluding fluentd logs (hence the numerous '\' and the circular log messages)
- k3s will prefix the log lines with datetime, stream (stdout, stderr) and a log tag. So, if your message is "hello Dolly", k3s will save it to the file as:
2020-10-20T18:05:39.163671864-05:00 stdout F "hello Dolly"
The pattern not match explains why kibana doesn't see any error message. They're not being sent to your elastic service.
Having a proper filter/parser would help on this.
Can you post your fluentd conf?
Is there a good way for fluentd's own logs to be shipped if possible?
I got this issue as well, because I was using containerd instead of docker. I solved it by putting in the following configuration:
- name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
@micktg
Your solution fixed my problem! Much appreciation!
For lastest images, use cri parser is better than regexp: https://github.com/fluent/fluentd-kubernetes-daemonset#use-cri-parser-for-containerdcri-o-logs
I followed a digital ocean tutorial https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes to setup my EFK for kubernetes and faced the same issue. The above answer by @micktg resolved the issue. I added the below in environment variables of my fluentd yaml file, so now my environment variables look like this
env:
- name: FLUENT_ELASTICSEARCH_HOST
value: "elasticsearch.kube-logging.svc.cluster.local"
- name: FLUENT_ELASTICSEARCH_PORT
value: "9200"
- name: FLUENT_ELASTICSEARCH_SCHEME
value: "http"
- name: FLUENTD_SYSTEMD_CONF
value: disable
- name: FLUENT_CONTAINER_TAIL_EXCLUDE_PATH
value: /var/log/containers/fluent*
- name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
I found @micktg and @varungupta19 answer solve the problem.
Thanks, @micktg and @varungupta19. Problem solved.
I followed a digital ocean tutorial https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes to setup my EFK for kubernetes and faced the same issue. The above answer by @micktg resolved the issue. I added the below in environment variables of my fluentd yaml file, so now my environment variables look like this
env: - name: FLUENT_ELASTICSEARCH_HOST value: "elasticsearch.kube-logging.svc.cluster.local" - name: FLUENT_ELASTICSEARCH_PORT value: "9200" - name: FLUENT_ELASTICSEARCH_SCHEME value: "http" - name: FLUENTD_SYSTEMD_CONF value: disable - name: FLUENT_CONTAINER_TAIL_EXCLUDE_PATH value: /var/log/containers/fluent* - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
adding value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
this didn't help me. I am trying to implement it in microk8s