fingltd/4mc

hadoop-4mc with AWS EMR?

Closed this issue · 2 comments

Hey Carlo!

Great work on this repository; I'm very excited about the potential.

I'm spinning up some AWS EMR clusters for a production workflow and I'm hoping to incorporate your compression. I'm curious about whether or not there would be an easy way to configure AWS EMR to pull in the hadoop-4mc library when it's started, since it'll be a bit of a pain to go in after the fact and install the library across the cluster.

Do you have any advice or suggestions for how to implement hadoop-4mc on an AWS EMR cluster, and if so could you add it to your documentation?

Cheers and thanks in advance!

I'm new to this space and naive, but after a day or so of googling I figured out more or less how to do this. I'll probably submit a little PR with some minor additions to the documentation to make installation trivial for a totally new (and braindead) user like myself. ;)

Cheers!

Thanks! With the latest feature of embedded native libs, it should work flawlessly as long as you add the jar to your job's cached libs.
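
For anyone landing here later, one way to add a jar to a job's cached libs is Hadoop's generic -libjars option. Below is a minimal Python sketch that only assembles the command line; it does not run anything. The jar path, job jar, main class, and S3 locations are hypothetical placeholders, not names from this repository. It also assumes the job's driver parses generic options (for example via ToolRunner), since -libjars is otherwise ignored.

```python
import shlex

# Hypothetical path for illustration only; point this at your actual hadoop-4mc jar.
FOURMC_JAR = "/home/hadoop/lib/hadoop-4mc.jar"


def build_hadoop_command(job_jar, main_class, lib_jars, job_args):
    """Return the argv list for `hadoop jar`, shipping extra jars via -libjars.

    Generic options such as -libjars go after the main class and before the
    job's own arguments. Multiple jars are passed as one comma-separated value.
    """
    if not lib_jars:
        raise ValueError("expected at least one library jar")
    return [
        "hadoop", "jar", job_jar, main_class,
        "-libjars", ",".join(lib_jars),
        *job_args,
    ]


if __name__ == "__main__":
    # All names below are placeholders (assumptions), not real artifacts.
    cmd = build_hadoop_command(
        "my-job.jar",
        "com.example.MyJob",
        [FOURMC_JAR],
        ["s3://my-bucket/input", "s3://my-bucket/output"],
    )
    # Print a shell-safe version of the command for copy/paste or an EMR step.
    print(shlex.join(cmd))
```

On EMR, the same argument list could presumably be supplied as the arguments of a custom JAR step, but that part is untested here.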

Let me know how it goes. I definitely need to find time not only for better docs but also to put up some real examples for Hadoop MR, Spark, and Flink.