pingles/clj-kafka

Upgrading to Kafka 0.8?

Closed this issue · 6 comments

What is needed? Can we help?

xpe commented

I have seen active discussion about using Storm with Kafka 0.8 in the storm-user mailing list thread titled "storm-kafka for Kafka 0.8?". I'm not yet sure what came out of that discussion.

We (uSwitch) are looking at setting up a new 0.8 cluster and migrating over. As part of this I've configured a Continuous Integration build against the 0.8 branch of Kafka. The CI machine is publishing SNAPSHOT jars to Clojars for now; we'll probably get around to starting a 0.8 branch of clj-kafka later this afternoon or tomorrow.

The SNAPSHOT'ed build is available here: https://clojars.org/com.uswitch/kafka_2.9.2

We're building it for use with Scala 2.9.2. I've not tried using the built library yet but I'm posting here in case anyone wants to jump in before we do.

Hi,

We (@Quantisan and I) are in the process of updating the code to work with 0.8. There are a few changes in 0.8 that affect us, so although I'll try to make sure we keep a similar API, it's possible we may need to change things around a little more (messages can now be keyed during partitioning, etc.). Most notably, the SimpleConsumer API seems to have morphed a little; I'm not too worried about it as most consumers should be using the Zookeeper route.

You can see our progress in the 0.8 branch: https://github.com/pingles/clj-kafka/tree/0.8

Of note: we've upgraded to Clojure 1.5.1, we've added a brokers function to the producer ns to help locate brokers (we're migrating our cluster to EC2 so this will be used to dynamically track brokers), and we've spent some time getting an embedded Zookeeper and Kafka server so we can write some automated end to end tests.

Also of note in the Zookeeper consumer: we no longer call to-clojure on messages in https://github.com/pingles/clj-kafka/blob/0.8/src/clj_kafka/consumer/zk.clj#L31. Whilst writing the tests we found that mapping across the results would cause the sequence to block. I'll be looking more at that tomorrow, but just returning the underlying message seems fine.
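For anyone unfamiliar with how a stream-backed sequence can block, here is a minimal sketch in plain Clojure. It does not use clj-kafka or Kafka at all: `blocking-seq` is a hypothetical helper, and a `LinkedBlockingQueue` stands in for the message stream. It only illustrates one way blocking can arise (realising more elements than have arrived) and is not a confirmed diagnosis of the behaviour described above.

```clojure
(import 'java.util.concurrent.LinkedBlockingQueue)

;; Hypothetical helper: a lazy sequence whose elements are taken from a
;; blocking queue. Realising an element blocks until one is available.
(defn blocking-seq [^LinkedBlockingQueue q]
  (lazy-seq (cons (.take q) (blocking-seq q))))

(def q (LinkedBlockingQueue.))
(.put q 1)
(.put q 2)

;; map is lazy, so this returns immediately without touching the queue.
(def xs (map inc (blocking-seq q)))

;; Realising exactly as many elements as are queued returns straight away.
(def first-two (doall (take 2 xs)))
;; => (2 3)

;; Realising a third element, e.g. (doall (take 3 xs)) or printing xs at
;; the REPL, would block until another item is put on the queue.
```

Anything that forces the whole sequence (printing it, `doall`, `count`) will wait indefinitely on a stream that never ends.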

So, there's a little bit to go but I've pushed an early SNAPSHOT release to Clojars:

[clj-kafka/clj-kafka "0.1.0-0.8-SNAPSHOT"]
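For anyone wanting to try it, a Leiningen `project.clj` along these lines should pull the SNAPSHOT from Clojars. The project name and version are placeholders; the Clojure version matches the 1.5.1 upgrade mentioned above.

```clojure
;; project.clj, with a hypothetical project name and version
(defproject my-kafka-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [clj-kafka/clj-kafka "0.1.0-0.8-SNAPSHOT"]])
```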

If you could kick the tyres, check the API changes and let us know how much of a problem they are, that would be great. The biggest change is https://github.com/pingles/clj-kafka/blob/0.8/src/clj_kafka/consumer/zk.clj#L31: we've changed it so you consume messages from a single topic. We had some problems getting the code updated to fetch from multiple topics and interleave them into a single sequence. I'll have another look, but I'm curious whether this is something other people are doing much of?
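To make the multi-topic question concrete, this is roughly the shape of what interleaving would involve, sketched in plain Clojure. It is not the clj-kafka implementation: `merge-seqs` is a hypothetical helper, and ordinary sequences stand in for per-topic message streams. One thread per input drains into a shared queue, and the result is a single lazy sequence read from that queue.

```clojure
(import 'java.util.concurrent.LinkedBlockingQueue)

;; Hypothetical helper: merge several (possibly blocking) sequences into
;; one lazy sequence. Ordering across inputs is not deterministic.
(defn merge-seqs [& seqs]
  (let [q (LinkedBlockingQueue.)]
    ;; One background thread per input sequence feeds the shared queue.
    (doseq [s seqs]
      (future (doseq [m s] (.put q m))))
    ;; Each element of the result blocks until some input yields a message.
    (repeatedly #(.take q))))

;; Two vectors stand in for two topic streams.
(def merged (set (take 4 (merge-seqs [:a1 :a2] [:b1 :b2]))))
;; merged contains all four messages, in whatever order they arrived
```

Like the single-topic sequence, the merged sequence never terminates on its own, so callers would need to `take` a bounded number of messages or stop consuming some other way.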

xpe commented

Thanks, I'll try to check that out soon!

I've updated the notes above. I'm going to start adding issues for the things we've yet to resolve and label them under 0.8.

I'm going to close this and move the items into 0.8 labelled issues.