holdenk/spark-testing-base

Use same Hadoop version as Spark target?

Opened this issue · 1 comment

I have had good success with this library, but I spent quite a bit of time fighting version mismatches between the Hadoop libraries Spark wants and the ones bundled with this tool. Pinning the Hadoop version to the target Spark version could clear up a lot of the dependency issues many of us face. In short, we would need a mapping of Spark version -> Hadoop version and to wire it into the sbt build, similar to how the source files are already selected per Spark version. A rough sketch of the idea is below.
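To be clear, this is just a minimal sketch of the mapping in sbt, not the project's actual build: the setting names are mine, and the Spark-to-Hadoop version pairs are illustrative assumptions that would need to be checked against each Spark release's default Hadoop build profile.

```scala
// build.sbt (sketch) -- derive the Hadoop version from the target Spark
// version instead of hard-coding Hadoop 2.8.3. The version pairs below are
// examples only; verify them against each Spark release's Hadoop profile.
val sparkVersion = settingKey[String]("Spark version to build against")

sparkVersion := sys.props.getOrElse("spark.version", "2.3.2")

val hadoopVersion = Def.setting {
  sparkVersion.value match {
    case v if v.startsWith("2.2") => "2.7.3" // assumed pairing for Spark 2.2.x
    case v if v.startsWith("2.3") => "2.7.3" // assumed pairing for Spark 2.3.x
    case _                        => "2.8.3" // current behavior as the fallback
  }
}

libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"    % sparkVersion.value % Provided,
  "org.apache.hadoop" %  "hadoop-client" % hadoopVersion.value
)
```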

I am guessing there are reasons why Hadoop 2.8.3 might be preferred for testing, but it makes dependency management a real pain when using Spark 2.2 or 2.3. Does the core functionality depend on anything in 2.8.3, or is it only needed by one of the more advanced features, like the MiniCluster support?
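For anyone else hitting this, the downstream workaround I've been using is to exclude the transitive Hadoop artifacts and pin my own. Roughly, in sbt (both version strings here are placeholders, not recommendations; use whatever matches your Spark build):

```scala
// Drop the Hadoop artifacts pulled in transitively by spark-testing-base
// and pin the hadoop-client that matches the Spark build under test.
libraryDependencies ++= Seq(
  ("com.holdenkarau" %% "spark-testing-base" % "2.3.1_0.10.0" % Test)
    .excludeAll(ExclusionRule(organization = "org.apache.hadoop")),
  "org.apache.hadoop" % "hadoop-client" % "2.7.3" % Test
)
```

This works, but it has to be repeated in every project, which is why handling the mapping inside this library's build would be nicer.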

Any chance this could be addressed for the 3.X versions?