typelevel/frameless

spark-sql 3.1.2 can't work with frameless-dataset 0.11.1

timshen24 opened this issue · 3 comments

My sbt file:

name := "spark-frameless"

version := "0.0.1"

scalaVersion := "2.12.12"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.1.2"
libraryDependencies += "org.typelevel" %% "frameless-dataset" % "0.11.1"

libraryDependencies += "com.github.mrpowers" %% "spark-daria" % "1.0.0"
libraryDependencies += "com.github.mrpowers" %% "spark-fast-tests" % "1.0.0"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1" % "test"

// test suite settings
fork in Test := true
javaOptions ++= Seq("-Xms512M", "-Xmx2048M", "-XX:MaxPermSize=2048M", "-XX:+CMSClassUnloadingEnabled")
// Show runtime of tests
testOptions in Test += Tests.Argument(TestFrameworks.ScalaTest, "-oDS")

Running it fails with:

Exception in thread "main" java.lang.NoSuchMethodError: 'scala.collection.Seq org.apache.spark.sql.catalyst.expressions.objects.Invoke$.apply$default$5()'
    at frameless.RecordEncoder.$anonfun$toCatalyst$2(RecordEncoder.scala:154)
    at scala.collection.immutable.List.map(List.scala:293)
    at frameless.RecordEncoder.toCatalyst(RecordEncoder.scala:153)
    at frameless.TypedExpressionEncoder$.apply(TypedExpressionEncoder.scala:28)
    at frameless.TypedDataset$.create(TypedDataset.scala:1248)
    at mrpowers.spark.frameless.DatasetCreator$.(DatasetCreator.scala:24)
    at mrpowers.spark.frameless.DatasetCreator$.(DatasetCreator.scala)
    at mrpowers.spark.frameless.Main$.delayedEndpoint$mrpowers$spark$frameless$Main$1(Main.scala:11)
    at mrpowers.spark.frameless.Main$delayedInit$body.apply(Main.scala:10)
    at scala.Function0.apply$mcV$sp(Function0.scala:39)
    at scala.Function0.apply$mcV$sp$(Function0.scala:39)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
    at scala.App.$anonfun$main$1$adapted(App.scala:80)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at scala.App.main(App.scala:80)
    at scala.App.main$(App.scala:78)
    at mrpowers.spark.frameless.Main$.main(Main.scala:10)
    at mrpowers.spark.frameless.Main.main(Main.scala)

However, when I change the two dependencies to

libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.1.0"
libraryDependencies += "org.typelevel" %% "frameless-dataset" % "0.10.1"

and keep the rest unchanged, everything works fine.

Do you guys know why, please?

Hi @timshen24, yes we know. Try the "org.typelevel" %% "frameless-dataset-spark31" % "0.11.1" dependency.

frameless-dataset is published against Spark 3.2.x, which is binary incompatible with 3.1.x. However, since frameless 0.11 we cross-publish artifacts for both Scala and Spark versions; artifacts for previous Spark versions have the -spark{major}{minor} suffix. For now, the mainline artifact will always point to the most recent Spark version, but we will try to publish artifacts for the previous versions as well.

Check out the readme for more details: https://github.com/typelevel/frameless#versions-and-dependencies. I will try to rewrite it so that this is more obvious and sits somewhere closer to the top (:
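
As a rough illustration of the naming scheme above, here is a minimal build.sbt sketch that keeps the Spark version and the frameless artifact in sync. The `sparkVersion`, `framelessVersion`, and `framelessArtifact` names are only illustrative, and the sketch assumes just the 3.1.x and 3.2.x lines discussed in this thread; other Spark versions need the suffix listed in the readme.

```scala
// build.sbt (sketch, not an official recommendation)
val sparkVersion     = "3.1.2"
val framelessVersion = "0.11.1"

// Assumption: only Spark 3.1.x and 3.2.x are handled here.
// Spark 3.1.x needs the suffixed artifact; the unsuffixed (mainline)
// artifact of frameless 0.11.1 is built against Spark 3.2.x.
val framelessArtifact =
  if (sparkVersion.startsWith("3.1.")) "frameless-dataset-spark31"
  else "frameless-dataset"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"       % sparkVersion,
  "org.typelevel"    %% framelessArtifact % framelessVersion
)
```

With `sparkVersion` set to "3.1.2" this resolves to frameless-dataset-spark31 0.11.1, which avoids the NoSuchMethodError from the original report.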

Thank you for the instant reply.

But setting

libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.2.0"
libraryDependencies += "org.typelevel" %% "frameless-dataset" % "0.11.1"

still does not work for me.

It reports compile errors because classes such as SparkSession, DataFrame, and Dataset cannot be found.


libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.1.2"
libraryDependencies += "org.typelevel" %% "frameless-dataset-spark31" % "0.11.1"

The above works for me right now.

@timshen24 3.2.0 didn't work for you because the other libraries in your build target different Spark versions. If you want to use Spark 3.2.0, review your dependencies: spark-fast-tests should be version 2.3.0_0.11.0, and spark-daria does not support Spark 3.2 at all.
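
A minimal sketch of what a Spark 3.2.0 build could look like once the mismatched libraries are taken out. This is an assumption-laden starting point rather than a verified configuration: spark-daria is dropped because it does not support Spark 3.2, and spark-fast-tests is left out on purpose so it can be re-added at a Spark 3.2 compatible version.

```scala
// build.sbt (sketch for Spark 3.2.0, under the assumptions stated above)
scalaVersion := "2.12.12"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"         % "3.2.0",
  // Mainline frameless artifact: built against Spark 3.2.x, so no suffix.
  "org.typelevel"    %% "frameless-dataset" % "0.11.1",
  "org.scalatest"    %% "scalatest"         % "3.0.1" % "test"
  // spark-daria removed: no Spark 3.2 support.
  // spark-fast-tests: re-add only at a version built for Spark 3.2.
)
```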

I'm closing this issue for now, since it is no longer a frameless issue, but feel free to reopen it if needed.