sakserv/hadoop-mini-clusters

HBase mini cluster fat jar

Closed this issue · 6 comments

The HBase mini-cluster uses the Shade plugin to build its artifact. It packs a huge number of libraries into that artifact without actually relocating their packages.
Is it done on purpose?

@isendel - thanks for reaching out.

This was done on purpose, but perhaps there are improvements that could be made. The issue here is that storm-core and hbase-testing-utils both depend on the lmax disruptor library, but on incompatible versions. Without relocating the lmax package, it would break the ability to test storm topologies that use an HBase spout/bolt.

Let me know your thoughts on better ways we could approach this while trying to maintain backwards compatibility.

Thanks!

I generally agree with relocating lmax; the shade plugin fits that perfectly well and keeps backward compatibility. My only concern is the rest of the dependencies and transitive dependencies that are included in the final artifact, which turns it into a fat jar. A whole bunch of commonly used classes from libraries like Guava, Jackson and many others end up inside that *-hbase jar.
If the LMAX dependency is the only one you need to shade, I would recommend excluding the rest of those packages in the shade plugin.
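A minimal sketch of what that could look like in the `pom.xml`, not the project's actual configuration. The excluded coordinates and the `shaded.com.lmax.disruptor` target package are illustrative assumptions; the real exclude list depends on what the module actually pulls in. It uses the shade plugin's `artifactSet` excludes (which accept `*` wildcards) together with a `relocation` for the disruptor package:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <excludes>
            <!-- Illustrative only: keep common libraries out of the shaded jar -->
            <exclude>com.google.guava:guava</exclude>
            <exclude>com.fasterxml.jackson.core:*</exclude>
          </excludes>
        </artifactSet>
        <relocations>
          <relocation>
            <!-- Move the disruptor classes so they cannot clash with storm-core's copy -->
            <pattern>com.lmax.disruptor</pattern>
            <!-- Hypothetical target package, chosen for this example -->
            <shadedPattern>shaded.com.lmax.disruptor</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

One caveat with this approach: relocation only rewrites references in classes that are bundled into the shaded jar, so whatever uses the disruptor has to stay inside the artifact, and anything excluded has to arrive as an ordinary dependency instead.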

@isendel - I have just pushed a commit that significantly reduces the size of the jar. Unfortunately, I couldn't find a clean way to express the excludes to the shade plugin other than a list. If you know of a cleaner approach, please do let me know.

Let me know what you think and I can include this in the upcoming release. Thanks!

@sakserv Great, thank you. I've made a PR that does the same thing in a more concise way. Please check it out.

Thanks for the feedback and PR. It has been merged. I will include this in the upcoming release. (FYI I'm aware of the build failures and am working towards moving to CircleCI.)

@sakserv Thank you so much, I appreciate your work on the project. It's really helpful.