logstash-plugins/logstash-input-elasticsearch

Add (socket) timeout support in configuration options

For "complex" queries to an Elasticsearch instance with significant data (e.g. over 5M documents), I get the following error:

{ 2058 rufus-scheduler intercepted an error:
  2058   job:
  2058     Rufus::Scheduler::CronJob "* * * * *" {}
  2058   error:
  2058     2058
  2058     Manticore::SocketTimeout
  2058     Read timed out
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/manticore-0.6.4-java/lib/manticore/response.rb:37:in `block in initialize'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/manticore-0.6.4-java/lib/manticore/response.rb:79:in `call'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/manticore-0.6.4-java/lib/manticore/response.rb:274:in `call_once'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/manticore-0.6.4-java/lib/manticore/response.rb:158:in `code'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/transport/http/manticore.rb:85:in `block in perform_request'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/transport/base.rb:262:in `perform_request'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/transport/http/manticore.rb:68:in `perform_request'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/elasticsearch-transport-5.0.5/lib/elasticsearch/transport/client.rb:131:in `perform_request'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/elasticsearch-api-5.0.5/lib/elasticsearch/api/actions/search.rb:183:in `search'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/logstash-input-elasticsearch-4.5.0/lib/logstash/inputs/elasticsearch.rb:298:in `search_request'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/logstash-input-elasticsearch-4.5.0/lib/logstash/inputs/elasticsearch.rb:246:in `do_run_slice'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/logstash-input-elasticsearch-4.5.0/lib/logstash/inputs/elasticsearch.rb:227:in `do_run'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/logstash-input-elasticsearch-4.5.0/lib/logstash/inputs/elasticsearch.rb:210:in `block in run'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:234:in `do_call'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:258:in `do_trigger'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:300:in `block in start_work_thread'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:299:in `block in start_work_thread'
  2058       org/jruby/RubyKernel.java:1446:in `loop'
  2058       /home/user/opt/logstash-7.6.1/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/jobs.rb:289:in `block in start_work_thread'

It seems that when the query takes too long to return a response, I get the SocketTimeout error.
When I set a custom socket_timeout of 120s in the transport_options, the error no longer occurs.
I did this by editing elasticsearch.rb directly, at line 177:

transport_options = {:socket_timeout => 120}
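
For context, here is a minimal, self-contained sketch of that workaround, assuming (as in plugin 4.5.0) that the client is created via Elasticsearch::Client.new; the host list and the hard-coded value are illustrative, not the plugin's exact source:

# Illustrative sketch only; not the plugin's actual code.
require 'elasticsearch'

# Manticore's read timeout, in seconds, raised from the default
transport_options = { :socket_timeout => 120 }

client = Elasticsearch::Client.new(
  :hosts             => [ 'elasticsearch:9200' ],
  :transport_options => transport_options
)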

Would it be possible to add a new configuration option to the input plugin for setting the socket_timeout?

If needed I could submit a PR.
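
As a rough illustration of what such an option could look like, here is a hypothetical sketch using the standard Logstash plugin config DSL; the option name socket_timeout_seconds and the surrounding code are illustrative, not an existing setting:

require "logstash/inputs/base"
require "elasticsearch"

class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
  config_name "elasticsearch"

  # Hypothetical option: read timeout, in seconds, for the HTTP client
  config :socket_timeout_seconds, :validate => :number, :default => 60

  def register
    # hand the setting through as Manticore's socket_timeout
    transport_options = { :socket_timeout => @socket_timeout_seconds }
    @client = Elasticsearch::Client.new(
      :hosts             => @hosts,
      :transport_options => transport_options
    )
  end
end

Users could then set socket_timeout_seconds => 120 inside the elasticsearch input block of their pipeline configuration.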

  • Version: 4.5.0
  • Operating System: Linux
  • Config File:
input {
  elasticsearch {
    hosts => [ "elasticsearch:9200" ]
    index => "myindex.*"
    query => '{"query":{"function_score":{"query":{"bool":{"must":[{"exists":{"field":"win"}},{"exists":{"field":"request"}},{"exists":{"field":"processed"}}]}},"random_score":{}}},"size":5000}'
    schedule => "* * * * *"
    scroll => "2m"
  }
}

I think the problem is loading 5M documents in one go. Depending on your Elasticsearch version, you can use a scroll query (for older ES) or a paged query, as sketched below.
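
For illustration, a minimal sketch of the two approaches with the Ruby client; es (an Elasticsearch::Client) and query (a query hash) are assumed placeholders:

# Sketch only: contrasting scroll and from/size paging.
# Scroll keeps a server-side cursor alive between batches.
response = es.search(:index => 'myindex.*', :scroll => '2m', :size => 1000, :body => query)
until response['hits']['hits'].empty?
  # ... process response['hits']['hits'] here ...
  response = es.scroll(:scroll_id => response['_scroll_id'], :scroll => '2m')
end

# From/size paging is stateless, but capped by index.max_result_window.
page = es.search(:index => 'myindex.*', :from => 0, :size => 1000, :body => query)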

I'm not loading 5M documents. The query actually hits about 50k documents (out of the 5M), but takes about 10s to return, which is above the default timeout (5s, I think?).

Even if you set the size really low, any large index will time out. This needs to be configurable.

I agree with you; please make it configurable.

Same problem here

The HTTP adapter we use here (Manticore) has three different timeouts (only two of which are standard to the Faraday library that the Elasticsearch client uses), and we do not override any of the default values:

  • request_timeout (integer, default: 60): sets the timeout for requests.
  • connect_timeout (integer, default: 10): sets the timeout for connections.
  • socket_timeout (integer, default: 10): sets SO_TIMEOUT for open connections.

SO_TIMEOUT is the Java setting for blocking operations on a socket. It looks like socket_timeout is Manticore-specific, and although we provide Manticore as a dependency of this plugin, we do not currently tie ourselves to an adapter-specific implementation, so we may need to do additional work to ensure that Faraday can't accidentally select a different adapter.
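
For reference, a sketch of what pinning the adapter could look like, assuming elasticsearch-transport 5.x with the manticore gem available; the values shown are examples only:

require 'elasticsearch'
require 'elasticsearch/transport/transport/http/manticore'

client = Elasticsearch::Client.new(
  :hosts             => [ 'elasticsearch:9200' ],
  # Pin the Manticore transport so Faraday cannot select another adapter
  # that would silently ignore the Manticore-specific option below.
  :transport_class   => Elasticsearch::Transport::Transport::HTTP::Manticore,
  :transport_options => {
    :request_timeout => 60,   # overall time allowed per request
    :connect_timeout => 10,   # time allowed to establish the connection
    :socket_timeout  => 120   # SO_TIMEOUT: max wait between socket reads
  }
)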