logstash-plugins/logstash-filter-json

json keys that look like references get dereferenced

bazzargh opened this issue · 6 comments

  • Version: logstash-filter-json (3.0.5) (as part of logstash 6.5.1)
  • Operating System: osx 10.13.6
  • Config File (if you have sensitive info, please remove it):
output {
  stdout {
    codec => rubydebug
  }
}
input {
  generator {
    lines => [
      '{"foo":"bar","nested":{"one":"two"},"not_nested[\"three\"]":"four","[what][is][this]":"five"}'
    ]
    count => 1
  }
}
filter {
  json {
    source => "message"
    remove_field => ["message"]
  }
}
  • Sample Data: in conf above
  • Steps to Reproduce: logstash -f bug.conf

Expected:

{
      "sequence" => 0,
    "@timestamp" => 2018-12-05T18:28:29.623Z,
      "[what][is][this]" => "five",
      "@version" => "1",
          "host" => "laptop.local",
      "not_nested[\"three\"]" => "four",
           "foo" => "bar",
        "nested" => {
        "one" => "two"
    }
}

Actual:

{
      "sequence" => 0,
    "@timestamp" => 2018-12-05T18:28:29.623Z,
          "what" => {
        "is" => {
            "this" => "five"
        }
    },
      "@version" => "1",
          "host" => "laptop.local",
    "not_nested" => {
        "\"three\"" => "four"
    },
           "foo" => "bar",
        "nested" => {
        "one" => "two"
    }
}

We spotted this with log data where a query string had ref-like keys, and got the "Detected ambiguous Field Reference" warning. I was surprised that the filter interprets these keys at all, and there doesn't seem to be an option to disable this behaviour. These are keys in our log data, not configuration for processing.

Similarly,

input { generator { count => 1 lines => [ '{ "foo[1]": "bar" }' ] } }
filter { json { source => "message" } }

will cause logstash to crash

RuntimeError: Invalid FieldReference: `foo[1]`
        set at org/logstash/ext/JrubyEventExtLibrary.java:92
     filter at /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-json-3.0.6/lib/logstash/filters/json.rb:106
       each at org/jruby/RubyHash.java:1419
     filter at /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-json-3.0.6/lib/logstash/filters/json.rb:106
  do_filter at /usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:143

or

org.jruby.exceptions.RuntimeError: (RuntimeError) Invalid FieldReference: `foo[1]`
    at org.logstash.ext.JrubyEventExtLibrary$RubyEvent.set(org/logstash/ext/JrubyEventExtLibrary.java:92) ~[logstash-core.jar:?]
    at usr.share.logstash.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_filter_minus_json_minus_3_dot_0_dot_6.lib.logstash.filters.json.filter(/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-json-3.0.6/lib/logstash/filters/json.rb:106) ~[?:?]
    at org.jruby.RubyHash.each(org/jruby/RubyHash.java:1419) ~[jruby-complete-9.2.7.0.jar:?]
    at usr.share.logstash.vendor.bundle.jruby.$2_dot_5_dot_0.gems.logstash_minus_filter_minus_json_minus_3_dot_0_dot_6.lib.logstash.filters.json.filter(/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-json-3.0.6/lib/logstash/filters/json.rb:106) ~[?:?]
    at usr.share.logstash.logstash_minus_core.lib.logstash.filters.base.do_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:143) ~[?:?]

depending on whether you have java_execution enabled.

I am observing the same issue that @TheVastyDeep describes.

+1 for this headache.
The root cause is https://github.com/logstash-plugins/logstash-filter-json/blob/master/lib/logstash/filters/json.rb#L106
where event.set dereferences JSON keys. Keys with brackets have unexpected outcomes. For example: [] crashes, [non_existing_field] resolves to null, [welcome[]] crashes.
Currently, we work around this by relying on https://github.com/logstash-plugins/logstash-filter-json/blob/master/lib/logstash/filters/json.rb#L86
With a target set, the parsed JSON is put into the target, free from the dereferencing bug.
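
A minimal sketch of that workaround, applied to the reproduction config at the top of this issue. The field name "parsed" is a placeholder chosen for illustration, not something the plugin requires. Per the comment above, keys stored under the target should be kept as literal keys rather than dereferenced; this has not been verified here across plugin versions.

```
filter {
  json {
    source => "message"
    # "parsed" is a hypothetical field name; pick any field that suits your events.
    # With a target set, the parsed JSON is stored under this field
    # instead of being merged key by key into the event root.
    target => "parsed"
    remove_field => ["message"]
  }
}
```

Downstream filters and outputs would then need to address the parsed data under [parsed], for example [parsed][foo], instead of at the top level of the event.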

Is this problem solved today? I hit the same problem in 7.5.5.

+1 for this headache. The root cause is https://github.com/logstash-plugins/logstash-filter-json/blob/master/lib/logstash/filters/json.rb#L106 where event.set dereferences JSON keys. Keys with brackets have unexpected outcomes. For example: [] crashes, [non_existing_field] resolves to null, [welcome[]] crashes. Currently, we work around this by relying on https://github.com/logstash-plugins/logstash-filter-json/blob/master/lib/logstash/filters/json.rb#L86 With a target set, the parsed JSON is put into the target, free from the dereferencing bug.

Can you show the details of this workaround?
I use

json {
              source => "message_log"
              target => "doc.custom"
              remove_field => ["message","message_log"]
          }

but it still uses a lot of memory.

Hi,

I'm getting the same issue with my logstash. Any idea how to fix this?