Add queue persistence to make pipeline resilient to crashes and restarts
suyograo opened this issue · 7 comments
Currently Logstash uses in-memory bounded queues between pipeline stages (input to filter, filter to output) to buffer events (see the documentation for more information). The size of these queues is fixed at 20 events and is not configurable, and any queued events can be lost if Logstash is terminated unsafely. To prevent event loss in these scenarios, we plan to persist these queues to disk.
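To illustrate the current behavior, here is a minimal sketch (plain Java, not Logstash's actual code) of a bounded in-memory queue between two pipeline stages; everything buffered this way lives on the heap and is lost if the process dies:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class InMemoryPipelineSketch {
    public static void main(String[] args) throws InterruptedException {
        // Bounded, heap-only queue between the input and filter stages.
        // Capacity 20 mirrors the fixed size described above; its contents
        // vanish if the process is killed before the consumer drains them.
        BlockingQueue<String> inputToFilter = new ArrayBlockingQueue<>(20);

        Thread filter = new Thread(() -> {
            try {
                while (true) {
                    // Blocks when the queue is empty.
                    System.out.println("filtering " + inputToFilter.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        filter.setDaemon(true);
        filter.start();

        for (int i = 0; i < 100; i++) {
            // Blocks when the queue is full, applying back-pressure upstream.
            inputToFilter.put("event-" + i);
        }
    }
}
```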
#1939 is work in progress by @colinsurprenant to add a persistent queue framework, which will be used to address this issue.
👍
May I suggest that this concept of an internal queue "provider" be abstracted in Logstash and implemented via a plugin/extensible mechanism? Rather than hardwiring #1939 in directly as a top-level option, make the top-level options something like:
```
--internal-queue-plugin [internal-queue-plugin-name] [--queue-plugin-option-A] ...
```
That way Logstash could ship with two implementations out of the box:
- a backward-compatible in-memory version (behaving as it does today)
- the "persistent" one being developed in #1939
It would just provide a more extensible hook point for further improvements down the road and let others implement whatever they wish for Logstash's internal queue mechanism.
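A rough sketch of what such a hook point could look like (the names here are hypothetical, not an actual Logstash API): a small provider interface the pipeline codes against, with the in-memory and persistent variants as interchangeable implementations selected by the proposed flag.

```java
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical SPI the pipeline would code against; both the current
// in-memory queue and the persistent one from #1939 would implement it.
interface QueueProvider<E> {
    void push(E event) throws IOException, InterruptedException; // blocks when full
    E pop() throws IOException, InterruptedException;            // blocks when empty
    void close() throws IOException;                             // flush / release resources
}

// Backward-compatible default: today's bounded in-memory behavior.
class InMemoryQueueProvider<E> implements QueueProvider<E> {
    private final BlockingQueue<E> queue;

    InMemoryQueueProvider(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    @Override public void push(E event) throws InterruptedException { queue.put(event); }
    @Override public E pop() throws InterruptedException { return queue.take(); }
    @Override public void close() { /* nothing to flush */ }
}

// Roughly what "--internal-queue-plugin memory" might resolve to at startup.
class QueueProviderFactory {
    static <E> QueueProvider<E> forName(String name, int capacity) {
        switch (name) {
            case "memory":
                return new InMemoryQueueProvider<>(capacity);
            // case "persistent": return a disk-backed implementation (#1939)
            default:
                throw new IllegalArgumentException("unknown queue plugin: " + name);
        }
    }
}
```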
Hi,
Could anyone provide an update on this issue?
Being able to reliably send logs to ES, using LS to enrich them along the way, would be a great feature.
If I'm not mistaken, Filebeat seems to be reliable, but it can't do as many things as LS does.
Regards.
@mostolog I'm not sure what you mean by reliable: the persistence we are working on is meant to prevent data loss on temporary machine faults (power loss, process crash, etc.). When Logstash is online, it shouldn't be losing events in flight. If you discover otherwise, please file a separate ticket describing your symptoms.
As for an update, we're working on this feature.
Progress on the first implementation can be tracked in #5638.
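For anyone following along, a minimal sketch of what enabling the on-disk queue might look like, assuming it ends up exposed through `logstash.yml` settings; the setting names below are illustrative and may not match what finally ships:

```yaml
# logstash.yml -- illustrative sketch only; final setting names may differ
queue.type: persisted                  # switch from the in-memory default to the on-disk queue
queue.max_bytes: 1gb                   # disk budget before back-pressure is applied upstream
path.queue: /var/lib/logstash/queue    # directory where queue data is written
```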