logstash-plugins/logstash-filter-json

Setting "target" and "source" to "message" silently drops events

jcmcken opened this issue · 5 comments

  • Version: 5.5.0
  • Operating System: CentOS 7.3
  • Config File (if you have sensitive info, please remove it):
      json {
        source => "message"
        target => "message"
        skip_on_invalid_json => true
      }

  • Sample Data:

This is a Mesos task log:

Registered docker executor on 10.x.x.x
Starting task sometask
{"foo": "bar", "baz": 1}
{"foo": "bar", "baz": 1}
{"foo": "bar", "baz": 1}
{"foo": "bar", "baz": 1}
  • Steps to Reproduce:

Send the sample data with Filebeat to Logstash, using the json filter configuration documented above.
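
For reference, a minimal sketch of a full pipeline along these lines (the Beats port and the stdout output here are assumptions for illustration, not required to trigger the bug):

    input {
      beats {
        port => 5044
      }
    }

    filter {
      json {
        source => "message"
        target => "message"
        skip_on_invalid_json => true
      }
    }

    output {
      # print each event so dropped events are easy to spot
      stdout {
        codec => rubydebug
      }
    }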

The input messages should be of the form:

...etc...
{"message": "Starting task sometask", ...}
{"message": "{\"foo\": \"bar\", \"baz\": 1}", ...}
...etc...

The output messages should look like:

...etc...
{"message": "Starting task sometask", ...}
{"message": {"foo": "bar", "baz": 1}, ...}
...etc...

Instead, the events that fail JSON parsing (e.g. Starting task sometask) pass through, but the valid JSON events are silently dropped.

If I change target to something else (e.g. target => "json"), it works as expected.
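
For clarity, the working variant is the same filter with only the target changed:

    json {
      source => "message"
      target => "json"
      skip_on_invalid_json => true
    }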

Perhaps related to this issue:

https://discuss.elastic.co/t/json-filter-dropping-messages/100144

I will try the workaround (assign a different name to target).

With a decent volume of data, this bug is easily reproducible. The json filter is silently dropping messages regardless of the configuration settings mentioned above.

Here is my new configuration for the json filter:

filter {
  json {
    source => "message"
    target => "jsonMsg"
    skip_on_invalid_json => true
    remove_field => "message"
  }
}

Sample JSON:
I'm able to reproduce this behavior with any JSON sample.

Input and output:
I am reading from a Kafka stream and persisting to a local file. I post 10,000 messages to Kafka and check the output file after each run.
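
A rough sketch of the test pipeline (the bootstrap server, topic name, and output path are placeholders, not my exact settings):

    input {
      kafka {
        bootstrap_servers => "localhost:9092"
        topics => ["test-topic"]
      }
    }

    # same json filter as above
    filter {
      json {
        source => "message"
        target => "jsonMsg"
        skip_on_invalid_json => true
        remove_field => "message"
      }
    }

    output {
      # write events to a local file for counting after each run
      file {
        path => "/tmp/json-filter-test.log"
      }
    }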

Versions of Logstash I've tried:

5.4.3
5.5.2

I am seeing the same issue even in our DEV environment with a relatively low log line count. We are sending log lines via Filebeat; Logstash then filters them with the json filter and sends them to Elasticsearch (all version 6.4.1). Viewing the data in Kibana shows not even half of the events.
I have found that, consistently, the same events make it through, while others with exactly the same timestamp get dropped. So when I have three events with the timestamp 2018-12-04T06:04:11.958-06:00, only one makes it through to Elasticsearch.
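
For completeness, the output side of our pipeline is a plain Elasticsearch output; the host and index pattern below are placeholders, not our real values:

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "app-logs-%{+YYYY.MM.dd}"
      }
    }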

Has anybody found any workaround?

Facing the same problem. Any updates on this?