bsm/poseidon_cluster

Hitting "Failed to connect to" error in fetch_loop

Closed this issue · 5 comments

rb2k commented

I am not sure whether this is a bug or the 'normal' behavior, in which case I would have to raise or disable a timeout somewhere.

I can connect fine and receive published messages.
Once there are no messages for 10 seconds, an exception gets thrown.
Sending a new message to the topic pushes this out (it seems to reset the counter).

require 'poseidon_cluster'

consumer = Poseidon::ConsumerGroup.new(
  "rb2k-testing-group",        # Group name
  ["some.machine.com:12333"],  # Kafka brokers
  ["some.machine.com:12330"],  # Zookeeper hosts
  "test"                       # Topic name
)

puts "Initialized!"

puts "Partitions: #{consumer.partitions.inspect}"
puts "Claimed: #{consumer.claimed.inspect}"


consumer.fetch_loop do |partition, bulk|
  bulk.each do |m|
    puts "Fetched '#{m.value}' at #{m.offset} from #{partition}"
  end
end

$ ruby kafkatest_consumer.rb
Initialized!
Partitions: [#<struct Poseidon::Protocol::PartitionMetadata error=0, id=0, leader=0, replicas=[0], isr=[0]>]
Claimed: [0]
Fetched 'value1' at 4 from 0
Fetched 'value1' at 5 from 0
/Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon-0.0.5/lib/poseidon/connection.rb:166:in `raise_connection_failed_error': Failed to connect to some.machine.com:12333 (Poseidon::Connection::ConnectionFailedError)
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon-0.0.5/lib/poseidon/connection.rb:123:in `rescue in read_response'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon-0.0.5/lib/poseidon/connection.rb:113:in `read_response'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon-0.0.5/lib/poseidon/connection.rb:76:in `fetch'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon-0.0.5/lib/poseidon/partition_consumer.rb:108:in `fetch'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon_cluster-0.3.0/lib/poseidon/consumer_group.rb:267:in `block in fetch'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon_cluster-0.3.0/lib/poseidon/consumer_group.rb:236:in `block in checkout'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon_cluster-0.3.0/lib/poseidon/consumer_group.rb:231:in `synchronize'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon_cluster-0.3.0/lib/poseidon/consumer_group.rb:231:in `checkout'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon_cluster-0.3.0/lib/poseidon/consumer_group.rb:266:in `fetch'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon_cluster-0.3.0/lib/poseidon/consumer_group.rb:340:in `block in fetch_loop'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon_cluster-0.3.0/lib/poseidon/consumer_group.rb:338:in `loop'
    from /Users/mseeger/.rvm/gems/ruby-2.1.5/gems/poseidon_cluster-0.3.0/lib/poseidon/consumer_group.rb:338:in `fetch_loop'
    from kafkatest_consumer.rb:17:in `<main>'
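
As a stopgap for the idle-time failure shown in the trace above, one option is to rescue the error and re-enter the loop. The following is only a sketch: `with_reconnect` is a hypothetical helper (not part of poseidon or poseidon_cluster), and whether the consumer group recovers cleanly after this error is not confirmed anywhere in this thread.

```ruby
# Hypothetical helper (not part of poseidon or poseidon_cluster):
# re-runs the block when `error_class` is raised, re-raising once
# `max_attempts` failures have occurred. Sleeps `delay` seconds
# between attempts. The failure count is never reset after a success.
def with_reconnect(error_class, max_attempts: 5, delay: 1)
  attempts = 0
  begin
    yield
  rescue error_class => e
    attempts += 1
    raise if attempts >= max_attempts
    warn "#{e.class}: #{e.message} (attempt #{attempts}, retrying)"
    sleep delay
    retry
  end
end

# Intended usage with the consumer from the script above (assumption:
# re-entering fetch_loop after the error is safe):
#
#   with_reconnect(Poseidon::Connection::ConnectionFailedError) do
#     consumer.fetch_loop do |partition, bulk|
#       bulk.each { |m| puts "Fetched '#{m.value}' at #{m.offset} from #{partition}" }
#     end
#   end
```

This only hides the symptom. If the cause is a socket timeout while the topic is idle, raising that timeout is the cleaner fix.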
rb2k commented

I assume it has something to do with:

bpot/poseidon@63efab2

I had the same issue, too.

dpbus commented

Seeing the same issue. Were you able to find a fix or workaround?

rb2k commented

Sadly no, I was working on a POC, so I didn't investigate much further. Just thought it might be worth a github issue :)

On May 13, 2015, at 5:01 PM, David Busse notifications@github.com wrote:

Seeing the same issue. Were you able to find a fix or workaround?

dim commented

You can now specify a custom timeout, see https://github.com/bsm/poseidon_cluster/blob/master/lib/poseidon/consumer_group.rb#L106. If timeouts are not the problem, please file an issue with https://github.com/bpot/poseidon. This looks like a poseidon connection issue and doesn't seem to be cluster related.
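
For anyone landing here later, a rough sketch of what passing a custom timeout might look like. It rests on two assumptions that should be checked against the linked consumer_group.rb: that `Poseidon::ConsumerGroup.new` takes an options hash as its fifth argument, and that the relevant key is `:socket_timeout_ms`. The helper name `consumer_group_options` is made up for illustration.

```ruby
# Hypothetical helper: builds the options hash for the consumer group.
# ASSUMPTION: :socket_timeout_ms is the option key for the broker socket
# timeout, in milliseconds. Verify against the linked source line.
def consumer_group_options(idle_seconds)
  { socket_timeout_ms: idle_seconds * 1000 }
end

options = consumer_group_options(60)

# Only runs when the gem has been loaded, e.g. via require 'poseidon_cluster'.
if defined?(Poseidon::ConsumerGroup)
  consumer = Poseidon::ConsumerGroup.new(
    "rb2k-testing-group",        # Group name
    ["some.machine.com:12333"],  # Kafka brokers
    ["some.machine.com:12330"],  # Zookeeper hosts
    "test",                      # Topic name
    options                      # ASSUMPTION: options hash as fifth argument
  )
end
```

The 60 seconds here is an arbitrary value, chosen only to be longer than the 10 seconds of idle time after which the error was reported.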