logstash-plugins/logstash-input-s3

Metadata missing from last event when using multiline codec


Opening this issue as a result of this forum post: https://discuss.elastic.co/t/multiline-plugin-metadata-missing-from-last-line/136725

To summarize: when the multiline codec combines multiple lines into one event, the metadata is missing from the last event in the file. From the discussion, it looks like [@metadata][s3][key] is not set on the event that is emitted when the codec is flushed (line 220).

My configuration:

input {
  s3 {
    bucket => "bucket_name"
    region => "us-east-2"
    codec => multiline {
      pattern => "^(%{DATESTAMP})"
      negate => "true"
      what => "previous"
    }
  }
}
filter { mutate { add_field => { "file_name" => "%{[@metadata][s3][key]}" } } }
output { stdout { codec => rubydebug } }

Sample input file:

06-19-2018 15:25:35.7046|ERROR
	more info...
06-19-2018 15:25:35.7046|DEBUG
	more info...
06-19-2018 15:25:35.7046|INFO
	more info...

Logstash output:

{
    "@timestamp" => 2018-06-20T14:41:09.998Z,
       "message" => "06-19-2018 15:25:35.7046|ERROR\r\n\tmore info...\r",
          "tags" => [
        [0] "multiline"
    ],
      "@version" => "1",
     "file_name" => "sampleLog.txt"
}
{
    "@timestamp" => 2018-06-20T14:41:09.998Z,
       "message" => "06-19-2018 15:25:35.7046|DEBUG\r\n\tmore info...\r",
          "tags" => [
        [0] "multiline"
    ],
      "@version" => "1",
     "file_name" => "sampleLog.txt"
}
{
    "@timestamp" => 2018-06-20T14:41:09.999Z,
       "message" => "06-19-2018 15:25:35.7046|INFO\r\n\tmore info...\r",
          "tags" => [
        [0] "multiline"
    ],
      "@version" => "1",
     "file_name" => "%{[@metadata][s3][key]}"
}

I have the same problem, and it overshadows everything else the multiline codec offers. None of the tracebacks I log carry metadata, which makes them useless.

In the s3 input plugin, whenever the multiline codec actually merges lines, the metadata is missing and add_field has no effect (add_tag seems to have no effect either). My current version is 6.7.0.

I will try version 7.0 to see whether the multiline codec still causes the metadata to go missing.


I have tried it in version 7.0.0, and the metadata is still missing.

My fallback plan is to use the aggregate filter plugin instead, but that requires setting the number of Logstash filter workers to 1 (the -w 1 flag).

I'm facing the same issue.
If I use the multiline codec with this plugin, type, add_field & tags - all are ineffective.

What's the best workaround? My constraints are that I HAVE to set any of the above fields in the input itself so I can use it to conditionally output the events to separate destinations.

I can't even use the id in conditionals, since it doesn't stick to events.

UPDATE: I fixed it in my fork, added a PR

It's not just the decorate that is missing when the codec is flushed. All of the metadata handling is skipped. Basically, all of the stuff done in the '@codec.decode(line) do |event|' loop also has to be done in the '@codec.flush do |event|' block.
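To illustrate the point above, here is a minimal, self-contained sketch of the pattern. It is not the plugin's actual code: `ToyMultilineCodec`, `finalize_event`, and `process_lines` are hypothetical names, and events are plain hashes rather than Logstash events. The idea is simply that the per-event handling lives in one helper that both the decode block and the flush block call.

```ruby
# Hypothetical stand-in for a multiline codec: it buffers lines and only
# emits an event when the next line matching `pattern` arrives, or on flush.
class ToyMultilineCodec
  def initialize(pattern)
    @pattern = pattern
    @buffer = []
  end

  def decode(line)
    # A new matching line closes the previous buffered event.
    yield build_event if line =~ @pattern && !@buffer.empty?
    @buffer << line
  end

  def flush
    # Whatever is still buffered at end of file is emitted here.
    yield build_event unless @buffer.empty?
  end

  private

  def build_event
    event = { "message" => @buffer.join("\n"), "@metadata" => {} }
    @buffer = []
    event
  end
end

# Shared per-event handling (hypothetical helper), so the decode and flush
# paths cannot drift apart.
def finalize_event(event, key, queue)
  event["@metadata"]["s3"] = { "key" => key }
  queue << event
end

def process_lines(lines, key, codec, queue)
  lines.each do |line|
    codec.decode(line) { |event| finalize_event(event, key, queue) }
  end
  # The last buffered event only surfaces on flush, so it needs the same handling.
  codec.flush { |event| finalize_event(event, key, queue) }
end

queue = []
lines = ["06-19-2018 ERROR", "\tmore info...", "06-19-2018 INFO", "\tmore info..."]
process_lines(lines, "sampleLog.txt", ToyMultilineCodec.new(/^\d{2}-\d{2}-\d{4}/), queue)
queue.each { |e| puts "#{e['message'].inspect} key=#{e['@metadata']['s3']['key']}" }
```

If the flush block pushed the event straight onto the queue without calling the helper, the second event in this toy run would have no key, which is the same symptom as the unresolved `%{[@metadata][s3][key]}` in the original report.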

Hi,

Just hit this as well... Just wondering if there are any plans to fix this anytime soon.

Thanks!

@yaauie Given how many PRs there are for this (#218, #190, #173; see also elastic/logstash#9686), could you consider merging one of them?