fluent/fluentd-docs

Transfers over UDP may lose data if the consumer is down

ameyrk18 opened this issue · 3 comments

Hi,

It looks like the docs should state that, if we choose UDP over TCP for data transfer, data will be lost when the receiver goes down. Could this be added under the failure case scenarios at https://docs.fluentd.org/v1.0/articles/high-availability ?

For example this issue: https://github.com/emsearcy/fluent-plugin-gelf/issues/7

Thanks,
Amey

Hmm... I'm not sure about the details of your deployment, but an HA deployment can't recover from the potential data loss that comes with UDP.

> It looks like the docs should state that, if we choose UDP over TCP for data transfer, data will be lost when the receiver goes down. Could this be added under the failure case scenarios at https://docs.fluentd.org/v1.0/articles/high-availability ?

That pretty much depends on how the particular plugin is designed.

Fluentd core provides a set of APIs to plugins and is capable of handling
failure scenarios (e.g. resending records when the destination node is down)
for these plugins, provided that the plugins report failures back to the
core.
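To illustrate the idea, here is a minimal Ruby sketch of that contract. It is a simplified model and not Fluentd's actual code: `flush_with_retry`, `ReportingOutput`, and `SilentOutput` are hypothetical names made up for this example, and the real core adds buffering and retry backoff on top. The only assumption carried over from Fluentd is that an exception raised from `write` is what tells the core that delivery failed.

```ruby
# Hypothetical stand-in for the core's flush loop (not Fluentd's real code).
# It retries only when the plugin raises from write().
def flush_with_retry(output, chunk, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    output.write(chunk)
    { acknowledged: true, attempts: attempts }
  rescue StandardError
    retry if attempts < max_attempts
    { acknowledged: false, attempts: attempts }
  end
end

# A plugin that reports failures: it raises while the destination is down.
class ReportingOutput
  def initialize(failures_before_success)
    @failures_left = failures_before_success
  end

  def write(chunk)
    if @failures_left > 0
      @failures_left -= 1
      raise IOError, "destination is down"
    end
    chunk.size
  end
end

# A plugin that never reports failures: write() always returns normally,
# so the caller cannot tell whether the data arrived.
class SilentOutput
  def write(_chunk)
    nil
  end
end

reporting = flush_with_retry(ReportingOutput.new(2), "record")
silent    = flush_with_retry(SilentOutput.new, "record")

puts reporting.inspect
puts silent.inspect
```

In this model the reporting plugin gets its chunk retried until the third attempt succeeds, while the silent plugin is treated as successful on the first attempt no matter what happened on the wire.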

For example, see how fluent-plugin-gelf implements the write() interface:

https://github.com/emsearcy/fluent-plugin-gelf/blob/master/lib/fluent/plugin/out_gelf.rb#L112

If you choose UDP as the protocol, this method always returns successfully even
if the target node does not exist. This means that Fluentd core never knows
whether the data reached the destination node, since the plugin makes no
attempt to report the delivery status.
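The underlying socket behavior can be reproduced with plain Ruby, independent of any plugin. The sketch below is only an illustration under assumptions: `free_udp_port`, `free_tcp_port`, and `send_udp` are hypothetical helpers, and the demo targets localhost ports that were just released, so nothing is expected to be listening on them. It is not how fluent-plugin-gelf is implemented.

```ruby
require "socket"

# Hypothetical helper: bind to port 0 so the OS picks a free UDP port,
# then release it so that nothing is listening there.
def free_udp_port
  probe = UDPSocket.new
  probe.bind("127.0.0.1", 0)
  port = probe.addr[1]
  probe.close
  port
end

# Hypothetical helper: same idea for TCP.
def free_tcp_port
  server = TCPServer.new("127.0.0.1", 0)
  port = server.addr[1]
  server.close
  port
end

# Sends one datagram on an unconnected socket and returns what the
# sender sees: the number of bytes handed to the kernel.
def send_udp(payload, host, port)
  sock = UDPSocket.new
  begin
    sock.send(payload, 0, host, port)
  ensure
    sock.close
  end
end

# UDP: the send call completes even though no receiver exists.
sent = send_udp("log line", "127.0.0.1", free_udp_port)

# TCP: the connection attempt itself fails, so the sender can notice.
tcp_failed = begin
  TCPSocket.new("127.0.0.1", free_tcp_port).close
  false
rescue SystemCallError
  true
end

puts "UDP bytes handed to the kernel: #{sent}"
puts "TCP connect failed: #{tcp_failed}"
```

The UDP sender gets a byte count back and no error, which is why a plugin that only calls send has nothing to report. With TCP the failed connection surfaces as an exception that a plugin could pass on to the core.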

@repeatedly my stack has a set of Fluentd instances shipping logs to a log forwarder instance, from there to a load balancer, and from there to Graylog (the consumer). My point was that if we use aggregators and choose UDP over TCP to transmit messages to the consumer, data will be lost whenever the consumer is down. The document doesn't mention this failure scenario. I believe this section should include a note about it.

@fujimotos thanks for that.