logstash-plugins/logstash-filter-grok

[v3.2.0] Grok errors while ingesting 300k lines of apache log dataset

suyograo opened this issue · 1 comment

I am ingesting 300k Apache log lines with the alpha5 RC (which has logstash-filter-grok 3.2.0) and the hands-on workshop config. See test suite: https://github.com/elastic/logstash/blob/feature/bats_integration_tests/qa/integration/es_how_tests.bats

I got the following errors on the console. All log lines made it in, but these errors were thrown during the run:

Grok regexp threw exception {:exception=>"can't add a new key into hash during iteration", :backtrace=>["org/jruby/RubyHash.java:992:in `[]='", "/tmp/logstash-5.0.0-alpha5/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.0/lib/logstash/filters/grok/timeout_enforcer.rb:34:in `start_thread_groking'", "org/jruby/ext/thread/Mutex.java:149:in `synchronize'", "/tmp/logstash-5.0.0-alpha5/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.0/lib/logstash/filters/grok/timeout_enforcer.rb:33:in `start_thread_groking'", "/tmp/logstash-5.0.0-alpha5/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.0/lib/logstash/filters/grok/timeout_enforcer.rb:16:in `grok_till_timeout'", "/tmp/logstash-5.0.0-alpha5/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.0/lib/logstash/filters/grok/timeout_enforcer.rb:14:in `grok_till_timeout'", "/tmp/logstash-5.0.0-alpha5/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.0/lib/logstash/filters/grok.rb:331:in `match_against_groks'", "org/jruby/RubyArray.java:1613:in 
`each'", "/tmp/logstash-5.0.0-alpha5/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.0/lib/logstash/filters/grok.rb:328:in `match_against_groks'", "/tmp/logstash-5.0.0-alpha5/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.0/lib/logstash/filters/grok.rb:317:in `match'", "/tmp/logstash-5.0.0-alpha5/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.0/lib/logstash/filters/grok.rb:283:in `filter'", "org/jruby/RubyHash.java:1342:in `each'", "/tmp/logstash-5.0.0-alpha5/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.0/lib/logstash/filters/grok.rb:282:in `filter'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/filters/base.rb:155:in `multi_filter'", "org/jruby/RubyArray.java:1613:in `each'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/filters/base.rb:152:in `multi_filter'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/filter_delegator.rb:42:in `multi_filter'", "(eval):127:in `filter_func'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/pipeline.rb:250:in `filter_batch'", "org/jruby/RubyProc.java:281:in `call'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:155:in `each'", "org/jruby/RubyHash.java:1342:in `each'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/util/wrapped_synchronous_queue.rb:154:in `each'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/pipeline.rb:248:in `filter_batch'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/pipeline.rb:236:in `worker_loop'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/pipeline.rb:215:in `start_workers'"], :class=>"RuntimeError", :level=>:warn}
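The `RuntimeError` above is Ruby's built-in protection against mutating a Hash while it is being iterated. In the backtrace, the insertion happens under a mutex in `start_thread_groking` (a pipeline worker registering itself), while the enforcer thread iterates the same Hash elsewhere without that mutex. The rule itself can be reproduced single-threaded; a minimal sketch with hypothetical names:

```ruby
# Demonstrates the Ruby runtime rule behind the error above: a Hash
# may not gain a new key while it is being iterated. Names here are
# illustrative, not the plugin's actual variables.
def mutate_during_iteration
  running_threads = { main: Time.now }
  running_threads.each do |_thread, _started_at|
    # Adding a key mid-iteration raises
    # RuntimeError: can't add a new key into hash during iteration
    running_threads[:worker] = Time.now
  end
  nil
rescue RuntimeError => e
  e.message
end
```

In the plugin the two sides run on different threads, so the error only fires when a worker's insertion races the enforcer's sweep over the same Hash.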

Also, at the end (when the dataset ingestion is complete):

Pipeline aborted due to error {:exception=>#<NoMethodError: undefined method `stop!' for nil:NilClass>, :backtrace=>["/tmp/logstash-5.0.0-alpha5/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.0/lib/logstash/filters/grok.rb:393:in `close'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/plugin.rb:75:in `do_close'", "org/jruby/RubyArray.java:1613:in `each'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/pipeline.rb:385:in `shutdown_workers'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/pipeline.rb:160:in `run'", "/tmp/logstash-5.0.0-alpha5/logstash-core/lib/logstash/agent.rb:240:in `start_pipeline'"], :level=>:error}
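The `NoMethodError` suggests `close` can run while the filter's timeout enforcer is still nil (for example if shutdown races its creation, or it was never started). One defensive option, sketched with hypothetical names rather than the plugin's actual code, is to guard the `stop!` call:

```ruby
# Hedged sketch: guard the enforcer call in #close so shutdown does
# not blow up when the enforcer was never created. Class and method
# names here are hypothetical stand-ins for the plugin's internals.
class GrokFilterSketch
  def initialize(timeout_enforcer = nil)
    @timeout_enforcer = timeout_enforcer
  end

  # Safe whether or not @timeout_enforcer was ever assigned.
  def close
    @timeout_enforcer.stop! if @timeout_enforcer
    :closed
  end
end
```

Whether the real fix is a nil guard or ensuring the enforcer is always created before workers start is a design choice for the plugin.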

Ah, we missed a mutex synchronize in #cancel_timed_out; that will be an easy fix.
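The shape of that fix, as described: #cancel_timed_out must iterate the thread-tracking Hash under the same mutex that guards insertions in start_thread_groking, so a worker can never add a key mid-sweep. A minimal sketch under that assumption (hypothetical names and timeout handling, not the plugin's actual code):

```ruby
# Hedged reconstruction of the described fix: both the insertion path
# and the timeout sweep take the same Mutex, eliminating the
# "can't add a new key into hash during iteration" race.
class TimeoutEnforcerSketch
  def initialize(timeout_secs)
    @timeout_secs = timeout_secs
    @mutex = Mutex.new
    @threads_to_start_time = {} # Thread => monotonic start time
  end

  # Worker registers itself before groking (guarded, as in the backtrace).
  def start_thread_groking(thread = Thread.current)
    @mutex.synchronize { @threads_to_start_time[thread] = now }
  end

  # Worker deregisters when its grok attempt finishes.
  def stop_thread_groking(thread = Thread.current)
    @mutex.synchronize { @threads_to_start_time.delete(thread) }
  end

  # The fix: sweep under @mutex so concurrent registrations cannot
  # mutate the Hash while it is being iterated.
  def cancel_timed_out!
    @mutex.synchronize do
      @threads_to_start_time.each do |thread, started_at|
        thread.kill if now - started_at > @timeout_secs
      end
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end
```

Holding the mutex for the whole sweep is the simplest correct option; if sweep latency mattered, a synchronized snapshot (`@threads_to_start_time.dup`) could be iterated outside the lock instead.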