GoogleCloudPlatform/fluent-plugin-detect-exceptions

Dramatic increase in FluentD CPU usage after enabling exception detector plugin

caseyclarkjamf opened this issue · 0 comments

We're seeing CPU usage on our FluentD Pods increase from around 20% to 100% after enabling the exception detector plugin. This also causes the HPA to kick in, scaling the FluentD Pods from 3 to 14, with many of those staying at 100% CPU. The same number of applications (generating roughly the same volume of logs) is running during this time. CPU usage reliably comes back down after disabling the plugin.

Exception detector config:

<match kubernetes.**>
    @type detect_exceptions
    @id clusterflow:logging:default-s3:0
    languages ["java"]
    multiline_flush_interval 0.1
    remove_tag_prefix kubernetes
</match>

FluentD version: v1.14.6
Exception detector plugin version: 0.0.14

Has anyone else seen this behavior after enabling the plugin, and does anyone know of ways to improve performance?
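
One variation that may be worth testing is sketched below. It is an untested assumption, not a confirmed fix: it raises `multiline_flush_interval` so the forced flush of incomplete stack traces runs less often, and caps buffered traces with the plugin's `max_lines` and `max_bytes` parameters. The values 5, 500, and 500000 are illustrative placeholders, not recommendations. Whether the flush interval contributes to the CPU usage at all has not been established.

```
<match kubernetes.**>
    @type detect_exceptions
    @id clusterflow:logging:default-s3:0
    languages ["java"]
    # Assumption: a longer interval than 0.1s; 5 is a placeholder value
    multiline_flush_interval 5
    # Assumption: placeholder caps on the size of a buffered stack trace
    max_lines 500
    max_bytes 500000
    remove_tag_prefix kubernetes
</match>
```

A longer flush interval delays delivery of the last buffered lines of a stack trace by up to that interval, so it trades latency for fewer flush cycles.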

Please let me know if you need any more information.