logstash-plugins/logstash-output-tcp

tcp output Connection reset by peer - silent failure

test-in-prod opened this issue · 1 comment

Hi,

I have several Windows servers running the logstash agent as a service via NSSM. They all collect IIS and Win32 event logs, then use the tcp output to forward events to a central logstash server that handles alerting and distributes the data onward to elasticsearch.

output {
  tcp {
    host => "master-logstash.example.org"
    port => "5115"
    codec => "json_lines"
  }
}

It seems that occasional network hiccups kill the TCP connection, after which the originating logstash agents silently stop forwarding (processing?) events. Once the network settles, the connection does not seem to get re-established. Restarting logstash resolves this every time.

This is what gets logged when it happens:

{:timestamp=>"2015-09-24T23:06:19.033000-0700", :message=>"tcp output exception", :host=>"master-logstash.example.org", :port=>5115, :exception=>#<Errno::ECONNABORTED: Software caused connection abort - An established connection was aborted by the software in your host machine>, :backtrace=>["org/jruby/RubyIO.java:3020:in `sysread'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-tcp-1.0.0/lib/logstash/outputs/tcp.rb:105:in `register'", "org/jruby/RubyProc.java:271:in `call'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-json_lines-1.0.0/lib/logstash/codecs/json_lines.rb:49:in `encode'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-tcp-1.0.0/lib/logstash/outputs/tcp.rb:143:in `receive'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/outputs/base.rb:88:in `handle'", "(eval):293:in `output_func'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/pipeline.rb:244:in `outputworker'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/pipeline.rb:166:in `start_outputs'"], :level=>:warn}

{:timestamp=>"2015-09-25T19:28:16.742000-0700", :message=>"tcp output exception", :host=>"master-logstash.example.org", :port=>5115, :exception=>#<Errno::ECONNRESET: Connection reset by peer - Connection reset by peer>, :backtrace=>["org/jruby/RubyIO.java:3020:in `sysread'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-tcp-1.0.0/lib/logstash/outputs/tcp.rb:105:in `register'", "org/jruby/RubyProc.java:271:in `call'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-json_lines-1.0.0/lib/logstash/codecs/json_lines.rb:49:in `encode'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-tcp-1.0.0/lib/logstash/outputs/tcp.rb:143:in `receive'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/outputs/base.rb:88:in `handle'", "(eval):293:in `output_func'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/pipeline.rb:244:in `outputworker'", "D:/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.3-java/lib/logstash/pipeline.rb:166:in `start_outputs'"], :level=>:warn}
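To illustrate the behaviour I would expect instead of the silent stop, here is a minimal Ruby sketch of a writer that drops a dead socket, reconnects, and resends the payload. It is not the plugin's actual code: `ReconnectingWriter`, `max_attempts`, `delay`, and the `RETRYABLE` list are hypothetical names made up for this example, and the set of errors worth retrying is an assumption.

```ruby
require "socket"

# Errors treated as "connection is gone, try again" (an assumed list).
RETRYABLE = [
  Errno::ECONNRESET,
  Errno::ECONNABORTED,
  Errno::ECONNREFUSED,
  Errno::EPIPE,
  IOError
].freeze

# Hypothetical helper, not part of logstash-output-tcp.
# The block passed to `new` must return a connected, socket-like object.
class ReconnectingWriter
  def initialize(max_attempts: 5, delay: 1.0, &connect)
    @connect = connect
    @max_attempts = max_attempts
    @delay = delay
    @socket = nil
  end

  # Write payload, reconnecting and resending if the connection was lost.
  # Re-raises the last error once max_attempts is exhausted.
  def write(payload)
    attempts = 0
    begin
      attempts += 1
      @socket ||= @connect.call
      @socket.write(payload)
    rescue *RETRYABLE => e
      drop_socket
      raise e if attempts >= @max_attempts
      sleep(@delay)
      retry
    end
  end

  private

  # Close and forget the current socket so the next attempt reconnects.
  def drop_socket
    @socket.close if @socket && !@socket.closed?
  rescue IOError, SystemCallError
    # The socket is already unusable; nothing more to do.
  ensure
    @socket = nil
  end
end
```

With a real endpoint this could be used as `writer = ReconnectingWriter.new { TCPSocket.new("master-logstash.example.org", 5115) }` followed by `writer.write(line)`. The point is only that a failed write discards the socket and retries rather than leaving the output stuck. A real fix would also need to decide what happens to events once the retries run out.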

+1