This repository holds code and scripts to benchmark the fluentd, fluent-bit and vector log collectors as deployed on an OpenShift environment.
The repository includes the following components:
- Log load application, deployed inside containers, that creates log stress on OpenShift (refer to cluster-logging-load-client for more information)
- Collector configuration and deployment files, covering fluentd, fluent-bit and vector
- Benchmark monitoring and statistics component (a simple golang app, check-logs-sequence.go)
- Deployment and benchmark management scripts
Note: The benchmark intentionally uses a single OpenShift worker node. The benchmark script chooses one of the available nodes and deploys all benchmark components onto that node.
Note: To maximize benchmark accuracy, it is highly recommended that the cluster run no workloads or containers other than the ones deployed by this benchmark. If required, the evacuation configuration command-line parameter can be used to evacuate all unrelated pods from the chosen node, but this is less recommended.
- Deployed OCP cluster
- Golang version 1.5.2 or higher
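The prerequisites above can be checked before deploying. The following is a minimal sketch, not part of the repository: the function names `version_ge`, `go_version_from` and `preflight` are hypothetical, and the minimum version is taken from the list above. It assumes version components are purely numeric.

```bash
# Return 0 when dotted version $1 is greater than or equal to $2.
# Assumes every component is a plain integer (no "rc1" style suffixes).
version_ge() {
  local a="$1" b="$2" x y
  while [ -n "$a" ] || [ -n "$b" ]; do
    x="${a%%.*}"; y="${b%%.*}"          # leading component of each version
    [ "${x:-0}" -gt "${y:-0}" ] && return 0
    [ "${x:-0}" -lt "${y:-0}" ] && return 1
    case "$a" in *.*) a="${a#*.}" ;; *) a= ;; esac   # drop the component just compared
    case "$b" in *.*) b="${b#*.}" ;; *) b= ;; esac
  done
  return 0
}

# Extract "1.21.5" from a line such as "go version go1.21.5 linux/amd64".
go_version_from() {
  local v
  v="$(printf '%s\n' "$1" | awk '{print $3}')"
  printf '%s\n' "${v#go}"
}

# Hypothetical pre-deployment check: tools present, Go recent enough, logged in.
preflight() {
  command -v oc >/dev/null || { echo "oc not found in PATH" >&2; return 1; }
  command -v go >/dev/null || { echo "go not found in PATH" >&2; return 1; }
  local v
  v="$(go_version_from "$(go version)")"
  version_ge "$v" 1.5.2 || { echo "Go $v is older than 1.5.2" >&2; return 1; }
  oc whoami >/dev/null 2>&1 || { echo "not logged in; run oc login" >&2; return 1; }
}
```

Calling `preflight` before `./deploy_to_openshift.sh` would stop early with a message if a prerequisite is missing.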
- Log in to your cluster (using the oc login command)
- Clone this repository
- Execute ./deploy_to_openshift.sh
For a complete list of configuration options, execute: ./deploy_to_openshift.sh --help
For example, to benchmark using fluentbit as the log collector with the heavy configuration profile (detailed profile parameters), execute:
./deploy_to_openshift.sh -c=fluentbit -p=heavy
Another example, using fluentd with a specific configuration (partial CLO configuration):
./deploy_to_openshift.sh -p=heavy -c=fluentd -fc=conf/collector/fluentd/partial/CLO_no_measure.conf
Note: Make sure to use the = sign between each command-line key and value.
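To illustrate why the `=` matters, here is a minimal sketch of how a script might split a `-key=value` argument. It is an assumption about the general technique, not the actual parsing code in `deploy_to_openshift.sh`, and `split_opt` is a hypothetical helper name.

```bash
# Split one "-key=value" (or "--key=value") argument into "key value".
# An argument without "=" is rejected, which is what happens to "-c fluentbit".
split_opt() {
  local arg="$1" key value
  case "$arg" in
    -*=*)
      key="${arg%%=*}"     # text before the first "="
      value="${arg#*=}"    # text after the first "="
      key="${key#-}"       # strip one leading "-"
      key="${key#-}"       # strip a second "-" if present
      printf '%s %s\n' "$key" "$value"
      ;;
    *)
      echo "expected -key=value, got: $arg" >&2
      return 1
      ;;
  esac
}
```

With this kind of parsing, `split_opt -c=fluentbit` yields the key and value together, while a space-separated `-c fluentbit` arrives as two separate arguments and the first one fails to match.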
The load client is a golang application that generates logs and sends them to various output destinations in various formats. The app runs as a single executable and, depending on its configuration, can spawn multiple threads. Users can scale the app horizontally for heavy workloads.
To use a different load client, specify its image in the configuration:
./deploy_to_openshift.sh --stressorimage=<LINK TO YOUR CONTAINER IMAGE>
Note: Any vector configuration that uses a "kubernetes_log" source must also include a transform that changes "file" to "path", as in the default configuration. Otherwise, statistics capture will not measure log information.
A typical deployment of the benchmark components on an OpenShift cluster looks like this:
$ oc get pods
NAME READY STATUS RESTARTS AGE
capturestatistics-86cbb9d84d-jgh2d 1/1 Running 0 36s
fluentd-597f957d6b-gmxlb 1/1 Running 0 48s
heavy-log-stress-66dd57fb95-ghjv2 1/1 Running 0 56s
heavy-log-stress-66dd57fb95-kz95g 1/1 Running 0 56s
low-log-stress-6db87fbcbc-j5l8s 1/1 Running 0 56s
low-log-stress-6db87fbcbc-l57sb 1/1 Running 0 56s
low-log-stress-6db87fbcbc-lh5s2 1/1 Running 0 56s
low-log-stress-6db87fbcbc-llj84 1/1 Running 0 55s
low-log-stress-6db87fbcbc-nws4l 1/1 Running 0 56s
low-log-stress-6db87fbcbc-pljv4 1/1 Running 0 56s
low-log-stress-6db87fbcbc-r79vv 1/1 Running 0 56s
low-log-stress-6db87fbcbc-x4w8p 1/1 Running 0 55s
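Before reading results, it can help to confirm that every pod in a listing like the one above is Running. A minimal sketch, assuming the column layout shown (NAME READY STATUS ...); `count_not_running` is a hypothetical helper, not part of the repository.

```bash
# Count pods whose STATUS column (third field) is not "Running".
# Reads lines in the format printed by `oc get pods --no-headers` on stdin.
count_not_running() {
  awk 'NF >= 3 && $3 != "Running" { n++ } END { print n + 0 }'
}
```

For example, `oc get pods --no-headers | count_not_running` prints the number of pods that are still starting or have failed.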
Benchmark results are logged periodically on the capturestatistics pod and include:
- top information every 120 seconds (e.g. CPU, memory)
- Statistical information on logs created by the stress containers and logs captured by the collector. The information is reported both periodically and as a total from the beginning of the benchmark. In addition, log loss is calculated.
====> Top information on: Mon Mar 15 13:53:14 UTC 2021
top - 13:53:14 up 8:25, 0 users, load average: 0.51, 0.62, 2.22
Tasks: 438 total, 1 running, 437 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.1 us, 1.1 sy, 0.0 ni, 97.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 63605.7 total, 41370.7 free, 5408.4 used, 16826.6 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 58234.2 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1876 root 20 0 3415160 192296 66372 S 12.5 0.3 61:22.61 kubelet
8403 nobody 20 0 739148 36756 22240 S 6.2 0.1 0:04.47 kube-rbac-proxy
8459 nobody 20 0 3380228 2.0g 158176 S 6.2 3.3 61:41.15 prometheus
508900 root 20 0 143732 2704 1976 S 6.2 0.0 0:04.97 conmon
509792 root 20 0 110996 40860 8996 S 6.2 0.1 3:08.95 fluent-bit
532842 root 20 0 7696 4084 3284 R 6.2 0.0 0:00.01 top
1 root 20 0 249324 17652 9036 S 0.0 0.0 3:23.92 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.05 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0H-kblockd
9 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
10 root 20 0 0 0 0 S 0.0 0.0 0:00.81 ksoftirqd/0
11 root 20 0 0 0 0 I 0.0 0.0 0:12.28 rcu_sched
12 root rt 0 0 0 0 S 0.0 0.0 0:00.03 migration/0
13 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
15 root 20 0 0 0 0 S 0.0 0.0 0:00.01 cpuhp/1
16 root rt 0 0 0 0 S 0.0 0.0 0:00.02 watchdog/1
17 root rt 0 0 0 0 S 0.0 0.0 0:00.03 migration/1
18 root 20 0 0 0 0 S 0.0 0.0 0:00.48 ksoftirqd/1
20 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/1:0H-kblockd
21 root 20 0 0 0 0 S 0.0 0.0 0:00.01 cpuhp/2
22 root rt 0 0 0 0 S 0.0 0.0 0:00.02 watchdog/2
23 root rt 0 0 0 0 S 0.0 0.0 0:00.02 migration/2
24 root 20 0 0 0 0 S 0.0 0.0 0:00.38 ksoftirqd/2
26 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/2:0H-kblockd
27 root 20 0 0 0 0 S 0.0 0.0 0:00.01 cpuhp/3
28 root rt 0 0 0 0 S 0.0 0.0 0:00.01 watchdog/3
29 root rt 0 0 0 0 S 0.0 0.0 0:00.03 migration/3
30 root 20 0 0 0 0 S 0.0 0.0 0:01.03 ksoftirqd/3
32 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/3:0H-kblockd
33 root 20 0 0 0 0 S 0.0 0.0 0:00.01 cpuhp/4
2021/03/15 13:53:15 Report at: 2021-03-15 13:53:15.789149368 +0000 UTC m=+1443.880773817
2021/03/15 13:53:15 -==-=-=-=-=
2021/03/15 13:53:15 Total number of collected logs : 21600000
2021/03/15 13:53:15 Logs per sec : 14958
2021/03/15 13:53:15 Time from start monitoring (in secs): 1444
2021/03/15 13:53:15 -==-=-=-=-=
2021/03/15 13:53:15 ----------------------------------------------------------------------------------------------------------------------------------------
2021/03/15 13:53:15 | | Current | Lines | | Total | Lines | | | |
2021/03/15 13:53:15 ----------------------------------------------------------------------------------------------------------------------------------------
2021/03/15 13:53:15 | Container name | Logged | Collected | Loss | Logged | Lo./Sec | Collected | Co./Sec | Loss |
2021/03/15 13:53:15 ----------------------------------------------------------------------------------------------------------------------------------------
2021/03/15 13:53:15 | low-log-stress-7c7f49566-49l24 | 19950 | 19950 | 0 | 2160000 | 1495 | 2159700 | 1495 | 300 |
2021/03/15 13:53:15 | low-log-stress-7c7f49566-6rsl2 | 19950 | 19950 | 0 | 2160150 | 1495 | 2160000 | 1495 | 150 |
2021/03/15 13:53:15 | low-log-stress-7c7f49566-6w4gk | 20035 | 20035 | 0 | 2160085 | 1495 | 2159935 | 1495 | 150 |
2021/03/15 13:53:15 | low-log-stress-7c7f49566-8bwsv | 19997 | 19997 | 0 | 2160000 | 1495 | 2159999 | 1495 | 1 |
2021/03/15 13:53:15 | low-log-stress-7c7f49566-8vl2j | 19950 | 19950 | 0 | 2160000 | 1495 | 2159850 | 1495 | 150 |
2021/03/15 13:53:15 | low-log-stress-7c7f49566-c2w56 | 20035 | 20035 | 0 | 2160085 | 1495 | 2160084 | 1495 | 1 |
2021/03/15 13:53:15 | low-log-stress-7c7f49566-f5wp9 | 19998 | 19998 | 0 | 2160198 | 1495 | 2160197 | 1495 | 1 |
2021/03/15 13:53:15 | low-log-stress-7c7f49566-r4c6v | 20185 | 20185 | 0 | 2160385 | 1496 | 2160385 | 1496 | 0 |
2021/03/15 13:53:15 | low-log-stress-7c7f49566-rmz24 | 19950 | 19950 | 0 | 2160000 | 1495 | 2160000 | 1495 | 0 |
2021/03/15 13:53:15 | low-log-stress-7c7f49566-vxfmp | 19950 | 19950 | 0 | 2160000 | 1495 | 2159850 | 1495 | 150 |
2021/03/15 13:53:15
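In the sample report above, each Loss value equals the Logged count minus the Collected count for that row (for example, 2160000 - 2159700 = 300). The sketch below totals those columns across all containers from saved report lines. It assumes the pipe-delimited row layout shown above; `summarize_loss` is a hypothetical helper and not the logic of check-logs-sequence.go.

```bash
# Sum the "Total Logged" and "Total Collected" columns of the report table
# and print the overall loss. Reads report lines on stdin.
# With "|" as the separator, field 6 is Total Logged and field 8 is Total Collected.
summarize_loss() {
  awk -F'|' '
    # Only data rows have numbers in both columns; headers and rulers are skipped.
    $6 ~ /^ *[0-9]+ *$/ && $8 ~ /^ *[0-9]+ *$/ {
      logged += $6
      collected += $8
    }
    END {
      loss = logged - collected
      pct = (logged > 0) ? 100 * loss / logged : 0
      printf "logged=%d collected=%d loss=%d loss_pct=%.4f\n", logged, collected, loss, pct
    }'
}
```

One way to use it is to save the capturestatistics pod log to a file and run `summarize_loss < report.log`; if the file holds several periodic reports, the rows of all of them are added together, so feed it a single report at a time.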
Running the benchmark against various scenarios can be achieved using the auto_execution.sh script, for example:
./contrib/auto_execution.sh -ff=conf/collector/fluentd/partial/
For more details, execute with the -h parameter.
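As a rough illustration of running several scenarios, the sketch below prints one deployment command per configuration file in a directory. It is an assumption about how such a loop could look, not the actual behavior of auto_execution.sh; `plan_runs` is a hypothetical helper, the `.conf` extension and the `-c=fluentd -p=heavy` choices are taken from the examples earlier in this document.

```bash
# Print (without executing) one deploy command per *.conf file in directory $1.
plan_runs() {
  local dir="${1%/}" conf
  for conf in "$dir"/*.conf; do
    [ -e "$conf" ] || continue   # the glob matched nothing
    printf './deploy_to_openshift.sh -c=fluentd -p=heavy -fc=%s\n' "$conf"
  done
}
```

Printing the commands first makes it easy to review the planned runs before executing any of them against the cluster.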