databricks/sbt-spark-package

Accessing the ml Spark component (without importing all the mllib stuff)

MrPowers opened this issue · 0 comments

I'm using the latest version of the plugin and can access the org.apache.spark.ml package with this line of code:

sparkComponents ++= Seq("sql", "hive", "mllib")
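
For context, my build.sbt looks roughly like this (the exact Scala patch version is a guess on my part; the Spark version and Scala binary version match the error message below):

scalaVersion := "2.11.8"

sparkVersion := "2.1.0"

sparkComponents ++= Seq("sql", "hive", "mllib")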

I thought that this would work:

sparkComponents ++= Seq("sql", "hive", "ml")

It seems like this code in the plugin should handle the "ml" component just fine:

sparkComponentSet.map { component =>
  "org.apache.spark" %% s"spark-$component" % sparkVersion.value % "provided"
}.toSeq
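
If I'm reading that right, the map blindly turns each component name into a Maven coordinate, so "ml" expands to something like the following (version and Scala suffix taken from my error message below):

// "ml" becomes org.apache.spark#spark-ml_2.11;2.1.0 after the %% expansion,
// and that artifact is apparently not published, hence the "not found" error.
"org.apache.spark" %% "spark-ml" % "2.1.0" % "provided"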

Here's the error message I get when I use sparkComponents ++= Seq("sql", "hive", "ml"):

sbt.ResolveException: unresolved dependency: org.apache.spark#spark-ml_2.11;2.1.0: not found

I don't really want to depend on all the "mllib" code, just the "ml" code. Thanks for the help.
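
My current understanding (an assumption on my part, not something I've confirmed in the Spark build) is that Spark doesn't publish a spark-ml artifact at all: the org.apache.spark.ml pipeline API ships inside the spark-mllib artifact. If that's right, the closest thing to a lighter dependency would be spark-mllib-local, which only contains the local linear algebra types, not the pipeline API. A sketch of what I mean, bypassing sparkComponents with a plain library dependency:

// Only covers org.apache.spark.ml.linalg (Vectors, Matrices),
// not the Pipeline/Estimator/Transformer API.
libraryDependencies += "org.apache.spark" %% "spark-mllib-local" % "2.1.0" % "provided"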