Fluentd cannot communicate with OpenSearch when using HTTPS
yoav-klein opened this issue · 4 comments
I am trying to use the fluentd-daemonset-opensearch.yaml configuration with a few modifications. My configuration looks like this:
env:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  - name: FLUENT_OPENSEARCH_HOST
    value: opensearch.storage.svc.cluster.local
  - name: FLUENT_OPENSEARCH_PORT
    value: "9200"
  - name: FLUENT_OPENSEARCH_INDEX_NAME
    value: fluentd
  - name: FLUENT_OPENSEARCH_SSL_VERIFY
    value: "false"
  - name: FLUENT_OPENSEARCH_SCHEME
    value: "https"
  - name: FLUENT_OPENSEARCH_USER
    value: "admin"
  - name: FLUENT_OPENSEARCH_PASSWORD
    value: "admin"
...
I have an OpenSearch cluster running in my Kubernetes cluster, with a Service, opensearch.storage.svc.cluster.local, routing traffic to it (I tested connectivity and it works fine).
My OpenSearch is configured with HTTPS. However, I want fluentd to skip TLS verification.
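For anyone who wants to repeat the connectivity test with verification skipped, here is a minimal sketch using Ruby's standard net/http. The helper names insecure_client and root_request and the RUN_CHECK environment variable are made up for this example. The host, port, and credentials are the ones from the env block above. Skipping verification is only reasonable for testing.

```ruby
require "net/http"
require "openssl"

# Hypothetical helper: build an HTTPS client that skips certificate
# verification, roughly what scheme https + ssl_verify false asks for.
def insecure_client(host, port)
  http = Net::HTTP.new(host, port)
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  http.open_timeout = 5
  http.read_timeout = 5
  http
end

# Hypothetical helper: GET / with HTTP basic auth.
def root_request(user, password)
  req = Net::HTTP::Get.new("/")
  req.basic_auth(user, password)
  req
end

# Only touch the network when explicitly asked to (RUN_CHECK is an
# assumption of this sketch, not a Fluentd or OpenSearch setting).
if ENV["RUN_CHECK"] == "1"
  http = insecure_client("opensearch.storage.svc.cluster.local", 9200)
  res = http.request(root_request("admin", "admin"))
  puts "#{res.code} #{res.message}"
end
```

Run from a pod in the cluster with RUN_CHECK=1 set, this prints the HTTP status returned by the Service. A successful response here only shows that the endpoint is reachable over HTTPS; it says nothing about how the fluentd plugin itself is configured.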
When I apply the DaemonSet, it doesn't work. Running kubectl logs on one of the pods shows this:
... rest of fluentd configuration
  </filter>
  <match **>
    @type opensearch
    @id out_os
    @log_level "info"
    include_tag_key true
    host "opensearch.storage.svc.cluster.local"
    port 9200
    path ""
    scheme https
    ssl_verify false
    ssl_version TLSv1_2
    ca_file "/etc/ca-certificates.conf"
    user "admin"
    password xxxxxx
    client_cert ""
    client_key ""
    client_key_pass xxxxxx
    index_name "fluentd"
    logstash_dateformat "%Y.%m.%d"
    logstash_format false
    logstash_prefix "logstash"
    logstash_prefix_separator "-"
    <buffer>
      flush_thread_count 1
      flush_mode interval
      flush_interval 60s
      chunk_limit_size 8M
      total_limit_size 512M
      retry_max_interval 30
      retry_timeout 72h
      retry_forever false
    </buffer>
  </match>
</ROOT>
2023-03-20 13:04:33 +0000 [info]: starting fluentd-1.15.3 pid=7 ruby="3.1.3"
2023-03-20 13:04:33 +0000 [info]: spawn command to main: cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/fluentd/vendor/bundle/ruby/3.1.0/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--gemfile", "/fluentd/Gemfile", "--under-supervisor"]
2023-03-20 13:04:33 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
2023-03-20 13:04:34 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
2023-03-20 13:04:34 +0000 [info]: adding match in @FLUENT_LOG pattern="fluent.**" type="null"
2023-03-20 13:04:34 +0000 [info]: adding filter pattern="kubernetes.**" type="kubernetes_metadata"
2023-03-20 13:04:34 +0000 [info]: adding match pattern="**" type="opensearch"
2023-03-20 13:04:37 +0000 [warn]: #0 [out_os] Could not communicate to OpenSearch, resetting connection and trying again. No such file or directory @ rb_sysopen - (Errno::ENOENT)
2023-03-20 13:04:37 +0000 [warn]: #0 [out_os] Remaining retry: 14. Retry to communicate after 2 second(s).
2023-03-20 13:04:41 +0000 [warn]: #0 [out_os] Could not communicate to OpenSearch, resetting connection and trying again. No such file or directory @ rb_sysopen - (Errno::ENOENT)
2023-03-20 13:04:41 +0000 [warn]: #0 [out_os] Remaining retry: 13. Retry to communicate after 4 second(s).
2023-03-20 13:04:49 +0000 [warn]: #0 [out_os] Could not communicate to OpenSearch, resetting connection and trying again. No such file or directory @ rb_sysopen - (Errno::ENOENT)
2023-03-20 13:04:49 +0000 [warn]: #0 [out_os] Remaining retry: 12. Retry to communicate after 8 second(s).
Can you help me understand what the problem is?
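One observation, offered as a guess and not a confirmed diagnosis: the warning ends with "rb_sysopen - " and no file name after the dash. That is the message Ruby produces when it is asked to open an empty path, and the dumped config above contains client_cert "" and client_key "". The sketch below reproduces that message and lists the path-like parameters that are empty strings. The helpers open_error_message and empty_path_params are hypothetical and are not part of Fluentd or the OpenSearch plugin; the config text is copied from the dump above.

```ruby
# Hypothetical helper: try to open a path and return the error text in
# the same "message (class)" shape seen in the fluentd warning.
def open_error_message(path)
  File.open(path) { |f| f.read(1) }
  nil
rescue SystemCallError => e
  "#{e.message} (#{e.class})"
end

# Parameters in the dumped <match> section that take a file path.
PATH_PARAMS = %w[ca_file client_cert client_key].freeze

# Hypothetical helper: names of path parameters whose value is "".
def empty_path_params(config_text)
  config_text.each_line.map do |line|
    key, value = line.strip.split(/\s+/, 2)
    key if PATH_PARAMS.include?(key) && value == '""'
  end.compact
end

# Excerpt of the config dump from the pod logs.
config = <<~CONF
  ca_file "/etc/ca-certificates.conf"
  client_cert ""
  client_key ""
  client_key_pass xxxxxx
CONF

puts open_error_message("")
puts empty_path_params(config).inspect
```

If the first line printed matches the warning in the pod logs, the empty client_cert and client_key values are worth a look. This does not show which parameter the plugin actually tries to open, so treat it as a lead to check and not as the answer.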
+1, I have the same problem on k3s.
OS is on a separate VM.
I've posted a fix here: fluent/fluent-plugin-opensearch#99 (comment)
This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days
This issue was automatically closed because of stale in 30 days