logstash-plugins/logstash-filter-geoip

Output of geoip filter changed in logstash 5.4.2 (no more GeoJSON)

matejzero opened this issue · 12 comments

Hello,

I'm testing Logstash 5.4.2 in my staging environment and I've noticed that the output of the geoip plugin has changed. Instead of the GeoJSON-style array I had in 5.4.1:

    "location": [
      -122.3042,
      47.913
    ],

I now get a hash of lat / lon.

    "location": {
      "lat": 47.913,
      "lon": -122.3042
    },
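For anyone who needs events from both versions to look the same downstream, here is a minimal Python sketch that normalizes either shape to the lat/lon hash. The function name `normalize_location` is hypothetical (it is not part of Logstash or Elasticsearch), and it assumes the array form is ordered [lon, lat], as in the 5.4.1 output above.

```python
def normalize_location(location):
    """Return {"lat": ..., "lon": ...} for either geoip location shape."""
    # 5.4.2 style: already a hash with "lat" and "lon" keys.
    if isinstance(location, dict):
        return {"lat": float(location["lat"]), "lon": float(location["lon"])}
    # 5.4.1 style: GeoJSON-ordered array, longitude first (assumption).
    if isinstance(location, (list, tuple)) and len(location) == 2:
        lon, lat = location
        return {"lat": float(lat), "lon": float(lon)}
    raise ValueError(f"unrecognized location shape: {location!r}")


old_style = [-122.3042, 47.913]
new_style = {"lat": 47.913, "lon": -122.3042}
print(normalize_location(old_style) == normalize_location(new_style))
```

The easy mistake here is the ordering: the array puts longitude first, while the hash is usually written latitude first.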

My geoip configuration looks like this:

    geoip {
      source => [ "src_ip" ]
      fields => [ "country_code2", "country_name", "latitude", "longitude", "location" ]
    }

Looking at the latest commits, I saw that this changed when some of the code was rewritten in Java. In the old Ruby plugin, location was saved as an array, but in the new Java code it's saved as a hash.

Java code: ec18789#diff-0db10ce49673bb86356e5398c93ad8b3R257
Ruby code: ec18789#diff-0ceaeedc6497c6d61fa5ae1d2db1dc58L212

Is this to be expected? Will ES still take a hash as a geo_point?

For all general issues, please provide the following details for fast resolution:

  • Version: 5.4.1 & 5.4.2
  • Operating System: Linux
  • Steps to Reproduce: Just use geoip filter and look at location field.

I looked at ES documentation and saw that it can take hash as geopoint type as well: https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html

Hi @matejzero. Thanks for posting this, I'm in the same situation (see above).

I understand that ES works on both types anyway, but I don't think this issue should be closed so easily. As a result of the update, we've now got an index with two mixed mappings. IMHO, there should have been at least a warning about the type change.

I'm currently using the default template provided by logstash ("version" : 50001). (Edited, no pun intended)

Interesting... location field is set to geo_point in that template.

    "geoip" : {
      "dynamic": true,
      "properties" : {
        "ip": { "type": "ip" },
        "location" : { "type" : "geo_point" },
        "latitude" : { "type" : "half_float" },
        "longitude" : { "type" : "half_float" }
      }
    }

How is the location field mapped now that you've upgraded? If you look at the latest index, what type is the location field? As far as I know, it should be geo_point if you are using the default template.

I might have to downgrade my logstash instances as well, since my indexes rotate tomorrow.

I just tried this on my cluster and even with the latest logstash, location field still gets mapped as geo_point and I don't get mixed templates.

Are you sure your indexes use latest logstash template?

Can you paste your logstash template from elasticsearch: curl -XGET es_host:9200/_template/logstash

You might have an old template there that doesn't set geoip fields.
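To check this without reading the whole template by eye, here is a small Python sketch that pulls out the declared type of geoip.location from the JSON that the curl command returns. It assumes the ES 5.x response shape (template name at the top level, then "mappings" keyed by mapping type); the function name `geoip_location_types` and the sample documents are made up for illustration and are not taken from any real cluster.

```python
import json


def geoip_location_types(template_response):
    """Map "template/mapping_type" to the declared type of geoip.location.

    The value is None when the template does not map geoip.location at all.
    """
    found = {}
    for name, template in template_response.items():
        for mapping_type, mapping in template.get("mappings", {}).items():
            geoip = mapping.get("properties", {}).get("geoip", {})
            location = geoip.get("properties", {}).get("location", {})
            found[f"{name}/{mapping_type}"] = location.get("type")
    return found


# Hand-written sample in the assumed 5.x shape, not real cluster output.
sample = json.loads("""
{
  "logstash": {
    "template": "logstash-*",
    "mappings": {
      "_default_": {
        "properties": {
          "geoip": {
            "dynamic": true,
            "properties": {
              "ip": { "type": "ip" },
              "location": { "type": "geo_point" }
            }
          }
        }
      }
    }
  }
}
""")

# Hypothetical older template that does not mention geoip at all.
old_sample = {"logstash": {"mappings": {"_default_": {"properties": {}}}}}

print(geoip_location_types(sample))
print(geoip_location_types(old_sample))
```

A value of None for a mapping type would mean the template leaves geoip.location to dynamic mapping, which is the case to look for.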

I think so. I've attached the output of
curl -XGET es_host:9200/_template/logstash?pretty

issue-123-logstash-template.txt

Weird, this template should not produce mixed mappings for location field, since mapping is set in template.

Can you look at your mappings in indexes and check what type is location?

Sure, here it is: issue-123-logstash-2017.06.21-mapping.txt

The offending index here is logstash-2017.06.21, the one from the day I upgraded to 5.4.2. geoip.location is set to geo_point. Now that I look at it, the mapping seems to be fine, but the format in which the values are stored differs (as discussed on the other ticket).

By the way, thanks for your help. I'm new to ES and still a bit lost, but slowly starting to grasp it.

I went and read your issue again and now I get it. Your field type stayed the same, but values saved are different (hash instead of array). That's the reason reindexing fails.

I replied to your ticket. I think it's best to continue the discussion there, since it's a different problem than the one I have.

The following config works with Logstash 5.4.1 but not with 5.4.2 (which I think is geoip 4.1.1)

    filter {
        if [Source][IP] {
            geoip {
                source => "[Source][IP]"
                target => "Source"
            }
        }
        if [Destination][IP] {
            geoip {
                source => "[Destination][IP]"
                target => "Destination"
            }
        }
    }

I get the following error with Logstash 5.4.2. Note that there is no elasticsearch output going on, so I know that it's not a template issue.

[
{"exception"=>"Missing Ruby class handling for full class name=java.util.HashMap, simple name=HashMap",
"backtrace"=>["org.logstash.Javafier.deep(org/logstash/Javafier.java:29)",
"org.logstash.Event.getField(org/logstash/Event.java:151)",
"org.logstash.filters.GeoIPFilter.applyGeoData(org/logstash/filters/GeoIPFilter.java:151)",
"org.logstash.filters.GeoIPFilter.handleEvent(org/logstash/filters/GeoIPFilter.java:143)",
"java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:498)", 
"RUBY.filter(/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-filter-geoip-4.1.1-java/lib/logstash/filters/geoip.rb:122)",
"LogStash::Filters::Base.do_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:145)",
"LogStash::Filters::Base.do_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:145)", 
"LogStash::Filters::Base.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:164)", 
"LogStash::Filters::Base.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:164)", 
"org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)",
"LogStash::Filters::Base.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:161)",
"LogStash::Filters::Base.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filters/base.rb:161)", 
"LogStash::FilterDelegator.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:43)", 
"LogStash::FilterDelegator.multi_filter(/usr/share/logstash/logstash-core/lib/logstash/filter_delegator.rb:43)",
"RUBY.initialize((eval):26185)",
"org.jruby.RubyArray.each(org/jruby/RubyArray.java:1613)", 
"RUBY.initialize((eval):26181)",
"org.jruby.RubyProc.call(org/jruby/RubyProc.java:281)", 
"RUBY.filter_func((eval):7655)", 
"LogStash::Pipeline.filter_batch(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:370)",
"LogStash::Pipeline.filter_batch(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:370)", 
"org.jruby.RubyProc.call(org/jruby/RubyProc.java:281)", 
"LogStash::Util::WrappedSynchronousQueue::ReadBatch.each(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:224)", 
"LogStash::Util::WrappedSynchronousQueue::ReadBatch.each(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:224)",
"org.jruby.RubyHash.each(org/jruby/RubyHash.java:1342)", 
"LogStash::Util::WrappedSynchronousQueue::ReadBatch.each(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:223)", 
"LogStash::Util::WrappedSynchronousQueue::ReadBatch.each(/usr/share/logstash/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:223)", 
"LogStash::Pipeline.filter_batch(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:369)", 
"LogStash::Pipeline.filter_batch(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:369)", 
"RUBY.worker_loop(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:350)", 
"RUBY.start_workers(/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:317)",
"java.lang.Thread.run(java/lang/Thread.java:748)"]}
]

I can only suspect that it may be due to 45ac731#commitcomment-22920532