fluent/fluentd-kubernetes-daemonset

Fluentd cannot communicate with OpenSearch when using HTTPS

yoav-klein opened this issue · 4 comments

I am trying to use the fluentd-daemonset-opensearch.yaml manifest with a few modifications. My env configuration is:

 env:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  - name: FLUENT_OPENSEARCH_HOST
    value: opensearch.storage.svc.cluster.local
  - name: FLUENT_OPENSEARCH_PORT
    value: "9200"
  - name: FLUENT_OPENSEARCH_INDEX_NAME
    value: fluentd
  - name: FLUENT_OPENSEARCH_SSL_VERIFY
    value: "false"
  - name: FLUENT_OPENSEARCH_SCHEME
    value: "https"
  - name: FLUENT_OPENSEARCH_USER
    value: "admin"
  - name: FLUENT_OPENSEARCH_PASSWORD
    value: "admin"
...

I have an OpenSearch cluster running in my Kubernetes cluster, exposed through the Service opensearch.storage.svc.cluster.local (I tested connectivity and it works).
OpenSearch is configured with HTTPS, but I want fluentd to skip TLS verification.
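
To make "skip TLS verification" concrete: at the Ruby level this usually means using a verify mode of VERIFY_NONE. The sketch below uses Ruby's standard Net::HTTP purely as an illustration. It is an assumption that the HTTP client used by the opensearch output plugin behaves analogously when ssl_verify is false. The snippet only builds a client object and sends nothing over the network.

```ruby
require "net/http"
require "openssl"

# Illustration only: Net::HTTP stands in for whatever HTTP client the
# plugin uses internally (an assumption, not taken from the plugin source).
# Net::HTTP.new does not open a connection, so this runs without a cluster.
http = Net::HTTP.new("opensearch.storage.svc.cluster.local", 9200)
http.use_ssl = true                           # corresponds to scheme https
http.verify_mode = OpenSSL::SSL::VERIFY_NONE  # corresponds to ssl_verify false

puts "use_ssl=#{http.use_ssl?} verify_none=#{http.verify_mode == OpenSSL::SSL::VERIFY_NONE}"
```

With VERIFY_NONE the server certificate is not checked, so no CA file should be needed for the handshake itself.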

After applying the DaemonSet, fluentd fails to connect to OpenSearch. Running kubectl logs on one of the pods shows:

... rest of fluentd configuration
  </filter>
  <match **>
    @type opensearch
    @id out_os
    @log_level "info"
    include_tag_key true
    host "opensearch.storage.svc.cluster.local"
    port 9200
    path ""
    scheme https
    ssl_verify false
    ssl_version TLSv1_2
    ca_file "/etc/ca-certificates.conf"
    user "admin"
    password xxxxxx
    client_cert ""
    client_key ""
    client_key_pass xxxxxx
    index_name "fluentd"
    logstash_dateformat "%Y.%m.%d"
    logstash_format false
    logstash_prefix "logstash"
    logstash_prefix_separator "-"
    <buffer>
      flush_thread_count 1
      flush_mode interval
      flush_interval 60s
      chunk_limit_size 8M
      total_limit_size 512M
      retry_max_interval 30
      retry_timeout 72h
      retry_forever false
    </buffer>
  </match>
</ROOT>
2023-03-20 13:04:33 +0000 [info]: starting fluentd-1.15.3 pid=7 ruby="3.1.3"
2023-03-20 13:04:33 +0000 [info]: spawn command to main:  cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/fluentd/vendor/bundle/ruby/3.1.0/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--gemfile", "/fluentd/Gemfile", "--under-supervisor"]
2023-03-20 13:04:33 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
2023-03-20 13:04:34 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
2023-03-20 13:04:34 +0000 [info]: adding match in @FLUENT_LOG pattern="fluent.**" type="null"
2023-03-20 13:04:34 +0000 [info]: adding filter pattern="kubernetes.**" type="kubernetes_metadata"
2023-03-20 13:04:34 +0000 [info]: adding match pattern="**" type="opensearch"
2023-03-20 13:04:37 +0000 [warn]: #0 [out_os] Could not communicate to OpenSearch, resetting connection and trying again. No such file or directory @ rb_sysopen -  (Errno::ENOENT)
2023-03-20 13:04:37 +0000 [warn]: #0 [out_os] Remaining retry: 14. Retry to communicate after 2 second(s).
2023-03-20 13:04:41 +0000 [warn]: #0 [out_os] Could not communicate to OpenSearch, resetting connection and trying again. No such file or directory @ rb_sysopen -  (Errno::ENOENT)
2023-03-20 13:04:41 +0000 [warn]: #0 [out_os] Remaining retry: 13. Retry to communicate after 4 second(s).
2023-03-20 13:04:49 +0000 [warn]: #0 [out_os] Could not communicate to OpenSearch, resetting connection and trying again. No such file or directory @ rb_sysopen -  (Errno::ENOENT)
2023-03-20 13:04:49 +0000 [warn]: #0 [out_os] Remaining retry: 12. Retry to communicate after 8 second(s).
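
One detail in the warnings above may be worth noting: the file name after `rb_sysopen -` is blank, which is what Ruby reports when it is asked to open an empty string as a path. The dumped configuration contains `client_cert ""` and `client_key ""`. Whether the plugin actually tries to read those empty values is an assumption here, not something the log confirms. A minimal Ruby sketch of how an empty path produces an error of this shape (`describe_open_error` is a hypothetical helper, not part of fluentd or the plugin):

```ruby
# Minimal sketch: opening an empty path raises Errno::ENOENT, and the
# error message carries a blank file name, as in the warnings above.
# describe_open_error is a hypothetical helper for illustration only.
def describe_open_error(path)
  File.open(path) { |f| f.read }
  nil # no error: the file could be opened and read
rescue SystemCallError => e
  # Format the error the same way the log line does: "message (Class)"
  "#{e.message} (#{e.class})"
end

empty_path_error = describe_open_error("")
puts empty_path_error
```

If that assumption holds, the failure would be unrelated to TLS verification or network connectivity, and checking which file-path parameters are rendered as empty strings would be a reasonable next step.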

Can you help me understand what the problem is?

+1, I have the same problem on k3s.
OS is on a separate VM.

This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days

This issue was automatically closed because of stale in 30 days