fluent/fluent-plugin-webhdfs

httpFS - Can not create file when it does not exist

hahafamilia opened this issue · 0 comments

<match fluentd.test>
   @type webhdfs
   path /tmp/fluentd/test/test.log
   host myhttpfs.example.com
   port 14000
   httpfs true
   username admin
   flush_interval 5s
</match>
2020-01-30 19:17:31 +0900 [warn]: #0 failed to flush the buffer. retry_time=0 next_retry_seconds=2020-01-30 19:17:32 +0900 chunk="59d58c35c3f9c0fc061dabc8b3243994" error_class=WebHDFS::ServerError error="Failed to connect to host myhttpfs.example.com:14000, end of file reached"
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:345:in `rescue in request'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:342:in `request'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:273:in `operate_requests'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:73:in `create'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.4/lib/fluent/plugin/out_webhdfs.rb:274:in `rescue in send_data'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.4/lib/fluent/plugin/out_webhdfs.rb:271:in `send_data'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.4/lib/fluent/plugin/out_webhdfs.rb:389:in `block in write'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.4/lib/fluent/plugin/out_webhdfs.rb:335:in `compress_context'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.4/lib/fluent/plugin/out_webhdfs.rb:388:in `write'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.7.4/lib/fluent/plugin/output.rb:1125:in `try_flush'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.7.4/lib/fluent/plugin/output.rb:1431:in `flush_thread_run'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.7.4/lib/fluent/plugin/output.rb:461:in `block (2 levels) in start'
  2020-01-30 19:17:31 +0900 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.7.4/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-01-30 19:17:34 +0900 [warn]: #0 failed to communicate hdfs cluster, path: /tmp/fluentd/20200130/access.log

I am using Cloudera CDH 6.1.
I have configured the plugin to use 'httpfs'.
The plugin cannot create the file when it does not already exist.
I have read issue #46.
I think I found the cause in the Cloudera documentation.
Could you please check this link?

Create and Write to a file

Note that the reason of having two-step create/append is for preventing clients to send out data before the redirect. 
This issue is addressed by the “Expect: 100-continue” header in HTTP/1.1; see RFC 2616, Section 8.2.3. 
Unfortunately, there are software library bugs (e.g. Jetty 6 HTTP server and Java 6 HTTP client),
which do not correctly implement “Expect: 100-continue”. 
The two-step create/append is a temporary workaround for the software library bugs.

RFC 2616, Section 8.2.3.

The file was created when I tested without sending any data.
When the request does send data, it must include the header 'Content-Type: application/octet-stream'.
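
To make the two-step create concrete, here is a minimal Ruby sketch that only builds the two requests and does not send them. The helper name `build_create_requests` is hypothetical and not part of the webhdfs gem or the plugin. The host, port, path and username are taken from the config above. The `op=CREATE` and `user.name` query parameters follow the WebHDFS REST API. In a real exchange the second request would go to the URL returned in the `Location` header of the first response; the sketch reuses the same URI as a placeholder.

```ruby
require "net/http"
require "uri"

# Hypothetical helper (not part of webhdfs or fluent-plugin-webhdfs):
# builds, but does not send, the two requests of a two-step CREATE.
def build_create_requests(host:, port:, path:, user:, body:)
  query = URI.encode_www_form("op" => "CREATE", "user.name" => user)
  uri = URI::HTTP.build(host: host, port: port,
                        path: "/webhdfs/v1#{path}", query: query)

  # Step 1: PUT without any data. The server is expected to answer
  # with a redirect whose Location header names the upload URL.
  step1 = Net::HTTP::Put.new(uri)

  # Step 2: PUT with the file data and the Content-Type header.
  # Assumption: the same URI stands in for the Location value here.
  step2 = Net::HTTP::Put.new(uri)
  step2["Content-Type"] = "application/octet-stream"
  step2.body = body

  [uri, step1, step2]
end

uri, step1, step2 = build_create_requests(
  host: "myhttpfs.example.com", port: 14000,
  path: "/tmp/fluentd/test/test.log", user: "admin", body: "test line\n"
)
puts uri
puts "step 1 body: #{step1.body.inspect}"
puts "step 2 Content-Type: #{step2['Content-Type']}"
```

If the first request already carries data, or the second one lacks the Content-Type header, that may explain the failure shown in the log above, but I have not confirmed this against the plugin code.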