openshift/cluster-logging-operator

Update fluent.conf to collect pod logs that are uncompressed and rotated by kubelet

Closed this issue · 2 comments

Describe the bug

Kubelet rotates the latest log without compression [1], so that the container can still write to it and fluentd can finish reading it. However, the OpenShift fluent.conf does not take advantage of this, because its path filter only reads "/var/log/pods/*/*/*.log".

[1] https://github.com/kubernetes/kubernetes/blob/c5cf0ac1889f55ab51749798bec684aed876709d/pkg/kubelet/logs/container_log_manager.go#L413

Environment

  • All versions of fluentd fluent.conf

Logs

Pod logs directory has two files available.

# ls /var/log/pods/*/flog-container/*

-rw-------. 1 root root 419M Sep 23 18:43 /var/log/pods/flog_flog-deployment-d97b4b954-ct276_f4639828-4c79-4785-8ef4-55ab9e4002ea/flog-container/0.log
-rw-------. 1 root root 1.1G Sep 23 17:47 /var/log/pods/flog_flog-deployment-d97b4b954-ct276_f4639828-4c79-4785-8ef4-55ab9e4002ea/flog-container/0.log.20230923-174723

However, fluentd is only reading 0.log because of the path filter /var/log/pods/*/*/*.log in fluent.conf.

# lsof /var/log/pods/*/flog-container/*

fluentd 3223626 root 2128r   REG  252,4 438032950 163578597 /var/log/pods/flog_flog-deployment-d97b4b954-ct276_f4639828-4c79-4785-8ef4-55ab9e4002ea/flog-container/0.log

Because fluentd does not pick up the uncompressed rotated files that kubelet leaves in place, logs are missed during rotation in high-volume environments.
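The effect of the path filter can be checked outside the cluster. The sketch below is only an illustration: it builds a throwaway directory that mimics the pod log layout (the pod directory name and the `.gz` file are made up for the example) and uses plain shell globbing, on the assumption that it treats `*` the same way as the fluent.conf path filter for these names. The wider pattern `*.log*` is a hypothetical alternative, not necessarily what the pull request uses.

```shell
#!/bin/sh
# Throwaway sandbox mimicking /var/log/pods/<pod>/<container>/ (names are made up).
tmp=$(mktemp -d)
dir="$tmp/pods/flog_pod_uid/flog-container"
mkdir -p "$dir"
# Active log, an uncompressed rotated log, and an (assumed) older compressed one.
touch "$dir/0.log" "$dir/0.log.20230923-174723" "$dir/0.log.20230923-164501.gz"

# Count how many existing files a glob expanded to (0 if nothing matched).
count() {
  n=0
  for f in "$@"; do
    if [ -e "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# Pattern from fluent.conf: only the active 0.log matches.
narrow=$(count "$tmp"/pods/*/*/*.log)

# Hypothetical wider pattern: also matches rotated files, compressed or not.
wide=$(count "$tmp"/pods/*/*/*.log*)

# Wider pattern minus .gz archives: the files that are still readable as text.
uncompressed=0
for f in "$tmp"/pods/*/*/*.log*; do
  case "$f" in
    *.gz) ;;                                   # skip compressed archives
    *) uncompressed=$((uncompressed + 1)) ;;
  esac
done

echo "narrow=$narrow wide=$wide uncompressed=$uncompressed"
rm -rf "$tmp"
```

With this layout the narrow pattern sees one file, while the wider pattern with compressed archives excluded sees both the active log and the rotated, still uncompressed one.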

Expected behavior
Fluentd should read all uncompressed pod logs, including the rotated ones.

Actual behavior
Fluentd is not reading the rotated logs that are still uncompressed.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy the flog container to generate fake logs:
$ cat flog-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flog-deployment
  namespace: flog
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flog
  template:
    metadata:
      labels:
        app: flog
    spec:
      containers:
      - name: flog-container
        image: quay.io/deployments/flog:latest
        env:
        - name: OPT
          value: "-d1ms -l"

$ oc apply -f flog-deploy.yaml
  2. Wait for the log rotation to happen.
  3. Fluentd won't read the uncompressed log file rotated by kubelet. It keeps reading only 0.log.

Additional context
This problem is especially painful in high-volume environments, which keep losing application logs.

Pull request created
#2176

Fluentd does in fact read these logs, because it keeps track of log files by their inode. The primary cause of log loss is that fluentd is unable to keep up with the load under certain conditions, especially when many containers log at high volume with large messages. I suggest using our Vector deployment, which generally has better throughput.
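The inode argument can be illustrated with a small shell sketch. It assumes that rotation is a rename of the active file followed by a fresh 0.log, and it stands in for the collector with a plain file descriptor, so it shows only the filesystem behaviour, not fluentd's actual implementation. The file names are copied from the report for readability.

```shell
#!/bin/sh
# Sandbox showing that a rename-based rotation keeps the same inode, so a
# reader that already has the file open can still finish reading it.
tmp=$(mktemp -d)

echo "line written before rotation" > "$tmp/0.log"
inode_before=$(ls -i "$tmp/0.log" | awk '{print $1}')

# Open the active log on descriptor 3, as a tailing process would.
exec 3< "$tmp/0.log"

# Rotate by rename (assumed rotation scheme), then start a fresh 0.log.
mv "$tmp/0.log" "$tmp/0.log.20230923-174723"
echo "line written after rotation" > "$tmp/0.log"

inode_rotated=$(ls -i "$tmp/0.log.20230923-174723" | awk '{print $1}')
inode_new=$(ls -i "$tmp/0.log" | awk '{print $1}')

# The open descriptor still refers to the rotated file, not the new 0.log.
read -r tailed <&3
exec 3<&-

echo "before=$inode_before rotated=$inode_rotated new=$inode_new"
echo "reader saw: $tailed"
rm -rf "$tmp"
```

The rotated file keeps the inode that 0.log had before the rename, and the descriptor opened before rotation still returns the old content. This is consistent with the lsof output in the report showing only the path 0.log: the path filter decides which files get opened, while a file that was already open stays readable after it is renamed.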