logstash-plugins/logstash-codec-fluent

@timestamp field ignored in actual message

awasthi-vivek opened this issue · 5 comments

The event structure of Fluentd consists of the following:

  • Tag
  • Time (Epoch time)
  • record (Actual log content - JSON format)

When a Fluentd event is received and decoded, the Time (epoch time) is used as the timestamp when creating the Logstash event, which sets the time for the event. Since Time is an epoch time in whole seconds, milliseconds can't be set. Even if the actual log content contains a field called @timestamp, it gets overwritten.

In our case, we set the @timestamp value in the record with a precision greater than epoch time, and we do not wish this to be overwritten.
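To make the reported behavior concrete, here is a minimal Python model of it. This is only a sketch of what the issue describes, not the plugin's actual Ruby code; the function name `decode_time_wins` and the placement of the tag under `tags` are assumptions made for illustration.

```python
from datetime import datetime, timezone

def decode_time_wins(tag, epoch_time, record):
    # Simplified model of the behavior reported in this issue,
    # not the plugin's real implementation.
    event = dict(record)
    event["tags"] = [tag]  # illustrative placement of the Fluentd tag
    # The Fluentd Time always replaces any @timestamp carried in the record.
    event["@timestamp"] = datetime.fromtimestamp(epoch_time, tz=timezone.utc)
    return event

record = {"@timestamp": "2018-02-20T10:51:06.123+00:00", "message": "hello"}
event = decode_time_wins("test", 1519124241, record)
print(event["@timestamp"].isoformat())
```

The millisecond value carried in the record does not survive, because the event time is rebuilt from an integer number of seconds.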

Here is the code setting @timestamp in the plugin

Can we have an option NOT to use the Time to set the Logstash event time, and let Logstash use the @timestamp field instead if it is present in the actual log?

This way we're not limited to seconds and can have time in finer precision.
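The requested option could look roughly like the following Python sketch. The flag name `prefer_record_timestamp` is hypothetical (no such setting exists in the plugin as far as this thread shows), and falling back to Time when @timestamp cannot be parsed is an assumption about what a sensible default would be.

```python
from datetime import datetime, timezone

def decode(tag, epoch_time, record, prefer_record_timestamp=False):
    # Hypothetical model of the requested option, not the plugin's code.
    event = dict(record)
    event["tags"] = [tag]  # illustrative placement of the Fluentd tag
    # Default: use the Fluentd Time, as the codec does today.
    timestamp = datetime.fromtimestamp(epoch_time, tz=timezone.utc)
    if prefer_record_timestamp and "@timestamp" in record:
        try:
            # Honor the record's own, possibly finer-grained, timestamp.
            timestamp = datetime.fromisoformat(record["@timestamp"])
        except (TypeError, ValueError):
            # Unparseable value: keep the Fluentd Time (assumed fallback).
            pass
    event["@timestamp"] = timestamp
    return event

record = {"@timestamp": "2018-02-20T10:51:06.123+00:00"}
print(decode("test", 1519124241, record, prefer_record_timestamp=True)["@timestamp"])
print(decode("test", 1519124241, record)["@timestamp"])
```

Keeping the flag off by default would leave existing pipelines unchanged.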

Fluentd protocol does not permit to set arbitrary timestamp which consists of String object.
The time field can hold either an integer (epoch time) or an EventTime (described in the specification linked below).

Instead, I proposed another higher precision handling patch: #18

This patch relies on this Fluentd protocol specification: https://github.com/fluent/fluentd/wiki/Forward-Protocol-Specification-v1#eventtime-ext-format
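For readers unfamiliar with EventTime: in the linked specification it is a MessagePack ext value of type 0 whose 8-byte payload holds seconds and nanoseconds as two 32-bit big-endian unsigned integers. The Python sketch below decodes only that payload, without a MessagePack library; the helper names are made up for illustration, and this is not the code from the patch.

```python
import struct
from datetime import datetime, timedelta, timezone

def decode_event_time(payload):
    # Payload layout: 4 bytes of seconds, then 4 bytes of nanoseconds,
    # both unsigned and big-endian.
    if len(payload) != 8:
        raise ValueError("EventTime payload must be exactly 8 bytes")
    seconds, nanoseconds = struct.unpack(">II", payload)
    return seconds, nanoseconds

def to_datetime(seconds, nanoseconds):
    # datetime only holds microseconds, so the last three digits are dropped.
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(microseconds=nanoseconds // 1000)

# Build a sample payload by hand instead of receiving one from Fluentd.
payload = struct.pack(">II", 1519124241, 123456789)
seconds, nanoseconds = decode_event_time(payload)
print(to_datetime(seconds, nanoseconds).isoformat())
```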

Fluentd protocol does not permit to set arbitrary timestamp which consists of String object.

Is there a link for this statement in the specification?

I agree that the Fluentd protocol and the Logstash plugin should support higher-precision EventTime, and it's good that you are addressing this via the patch for the plugin.

However, this issue is about the Logstash plugin currently overwriting a higher-precision @timestamp value with a lower-precision Time value. I think that if @timestamp is present, the original value should be honored.

Fluentd protocol does not permit to set arbitrary timestamp which consists of String object.

Is there a link for this statement in the specification?

time is a EventTime value (described below), or a number of seconds since Unix epoch.

ref: https://github.com/fluent/fluentd/wiki/Forward-Protocol-Specification-v1#event-modes

I agree that time will be an EventTime or a number of seconds since the Unix epoch.

What I mean is if there is a @timestamp present inside the actual message, for example:

<tag> <time> <actual log in JSON format>

test 1519124241 {"@timestamp": "2018-02-20T10:51:06+00:00", ...}

In this case, the time represented by @timestamp should be used and not overwritten by <time>.
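For the sample event above, the two candidate times can be compared directly. This is just arithmetic on the values from the example, written in Python for convenience; the variable names are illustrative.

```python
from datetime import datetime, timezone

# <time> from the sample event, as seconds since the Unix epoch.
fluentd_time = datetime.fromtimestamp(1519124241, tz=timezone.utc)
# @timestamp from the sample record.
record_time = datetime.fromisoformat("2018-02-20T10:51:06+00:00")

print(fluentd_time.isoformat())
print(record_time.isoformat())
print((fluentd_time - record_time).total_seconds())
```

The two values differ by several minutes here, so which one ends up as the event time changes more than the sub-second digits.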

If I'm not mistaken, Logstash only adds @timestamp if it is not already present.

logstash will choose a timestamp based on the first time it sees the event (at input time), if the timestamp is not already set in the event

With the JSON codec the behavior is as expected, but the Fluent codec differs from it.

Hi,

Is there any update? I'm facing this issue too.

Thanks
Marcel