kiwigrid/helm-charts

[BUG] fluentd-elasticsearch config error file="/etc/fluent/fluent.conf" error_class=Fluent::ConfigError error="Invalid Kubernetes API v1 endpoint


Is this a request for help?: No


Is this a BUG REPORT or FEATURE REQUEST? BUG REPORT

Version of Helm and Kubernetes:
Helm version:
Client: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.1", GitCommit:"618447cbf203d147601b4b9bd7f8c37a5d39fbb4", GitTreeState:"clean"}

Kubernetes version: 1.11.9

Which chart in which version:
kiwigrid/fluentd-elasticsearch 4.8.5 (I assume this version, as I just installed it using kiwigrid/fluentd-elasticsearch)

What happened:
One of the pods keeps failing with error:
config error file="/etc/fluent/fluent.conf" error_class=Fluent::ConfigError error="Invalid Kubernetes API v1 endpoint https://some-ip-address:443/api: Timed out connecting to server"

and logs are not available in my ELK stack.

What you expected to happen:
The pod should not fail, and logs should appear in the ELK stack.

How to reproduce it (as minimally and precisely as possible):
I keep deleting the deployment and reinstalling, and it keeps failing.
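For context, this error is raised by the kubernetes_metadata filter while fluentd parses its configuration, when it cannot reach the cluster API. A quick way to confirm whether the pod can reach the API server at all is to run a request from inside it; this is a rough sketch in which the namespace and pod name are placeholders and curl is assumed to be present in the image:

```sh
# Adjust the namespace and pod name to your release.
kubectl -n logging exec <fluentd-pod> -- sh -c \
  'curl -sk --max-time 10 \
     -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
     https://kubernetes.default.svc:443/api'
```

Any HTTP response, even a 401/403, means the network path works; the symptom reported here is a TCP-level timeout.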

Same issue here on a clean k8s cluster v1.15.4 created with kubeadm. It starts on the master node but fails with that message on all worker nodes.

Looking at fluent/fluentd-kubernetes-daemonset#169, it seems the problem is with the kubernetes_metadata_filter plugin. I edited the ConfigMap with kubectl edit configmap fluentd-fluentd-elasticsearch, removed the filter from the configuration (roughly the block sketched below), and now the pods start on the worker nodes as well. It would be great to have an option to specify whether this plugin should be included or not.
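For reference, the filter block that the chart renders into fluent.conf typically looks something like the sketch below (the exact @id and options vary by chart version); removing or commenting it out is what disables the API lookup:

```
<filter kubernetes.**>
  @id filter_kubernetes_metadata
  @type kubernetes_metadata
</filter>
```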

Without this plugin, Elasticsearch fields like container ID, pod name, and so on would be missing.
I'll check if there is a new version of the plugin available and, if so, I'll build an updated container image.

I've added a new image tag v3.0.0 to the registry.
Could you please check whether this solves the problem for you?

I'm sorry. I tested it thoroughly, and it turns out I have a problem with my cluster in which pods can't reach the network at all. The fix just removed a component that tried to contact the k8s API server, but fluentd then failed to contact ES because it can't reach anything outside of its container.

Sorry, this needs to be tested, but we don't have any fluentd-elasticsearch clusters available at the moment, so it can take some time.

I'm observing the same behavior with fluent/fluentd-kubernetes-daemonset:v1.7.4-debian-elasticsearch7-1.0 on K8s v1.15: commenting out the kubernetes_metadata filter lets the container skip the K8s API connection and finish starting up on a worker node.

@feromax

I've built a new Docker image based on Debian Buster and updated fluentd and the fluentd plugins.
Could you try whether this solves the problem for you, without commenting out the metadata filter?

The new image is available here:

monotek/fluentd-elasticsearch:50

Please try it together with the newest chart version: 5.2.0
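If it helps, an upgrade along these lines should pull in the new image. This is only a sketch: "fluentd" is a placeholder release name, and the image.repository / image.tag keys are assumptions to be checked against the chart's values.yaml:

```sh
helm repo update
# Release name "fluentd" and the image.* value keys are examples; verify against values.yaml.
helm upgrade --install fluentd kiwigrid/fluentd-elasticsearch \
  --version 5.2.0 \
  --set image.repository=monotek/fluentd-elasticsearch \
  --set image.tag=50
```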

@monotek thank you for responding! It turns out the issue was a CNI (Flannel) misconfiguration that was preventing fluentd from reaching the K8s API server; the fluent/fluentd-kubernetes-daemonset:v1.7.4-debian-elasticsearch7-1.0 image is working OK.
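For anyone hitting the same symptom, a couple of quick checks can help narrow it down to the CNI layer. This is a rough sketch; the DaemonSet name depends on how Flannel was installed, and on kubeadm clusters the --pod-network-cidr given at init time has to match Flannel's network:

```sh
# DaemonSet name varies (e.g. kube-flannel-ds or kube-flannel-ds-amd64).
kubectl -n kube-system get daemonset | grep -i flannel
# Every node should have a podCIDR; an empty column often means kubeadm init ran
# without a matching --pod-network-cidr.
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
```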

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.