scalapb/ScalaPB

Databricks imports `spark.implicits._` by default; is there a way to get around this?

honicky opened this issue · 1 comment

I'm new to Scala, so please excuse me if I ask dumb questions.

It looks like the setup for Databricks notebooks, which apparently imports some things (such as `spark`) on your behalf, also does an import of `spark.implicits._`. This results in the error:

```
error: could not find implicit value for evidence parameter of type frameless.TypedEncoder[Array[Byte]]
val deserialize_proto_udf = ProtoSQL.udf { bytes: Array[Byte] => SubscribeResponse.parseFrom(bytes) }
                                         ^
```

I think this is expected, based on the documentation, which states that bad things happen if `spark.implicits._` is imported and that we should instead do:

```
import scalapb.spark.Implicits._
```
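
For reference, the documented pattern I'm trying to follow looks roughly like this (just a sketch; `SubscribeResponse` is my message class and `ProtoSQL.udf` is the same call from the error above):

```
import scalapb.spark.Implicits._
import scalapb.spark.ProtoSQL

// With ScalaPB's implicits in scope (and not shadowed by spark.implicits._),
// ProtoSQL.udf can find the frameless TypedEncoder instances it needs.
val deserialize_proto_udf = ProtoSQL.udf { bytes: Array[Byte] =>
  SubscribeResponse.parseFrom(bytes)
}
```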


I'm hoping there may be a way around this, such as removing the items imported from `spark.implicits._`, changing their scope, or some other hack that would help me avoid this issue.

Please let me know if you have any ideas for workarounds.

Hi @honicky, unfortunately the default imports that Databricks notebooks inject take precedence over ScalaPB's, so it doesn't work out of the box. One workaround, which is somewhat inconvenient, is to pass ScalaPB's encoders directly. You can access them through `scalapb.spark.Implicits.xyz`; they are defined here: https://github.com/scalapb/sparksql-scalapb/blob/a5ce8bc18f102ee4c4993d43656095b25086349e/sparksql-scalapb/src/main/scala/scalapb/spark/TypedEncoders.scala#L108
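
For illustration only, here is a rough sketch of what passing the encoders directly could look like. The member names `messageTypedEncoder` and `typedEncoderToEncoder` are assumptions on my part (the actual definitions are in the linked `TypedEncoders.scala`), and `SubscribeResponse` is the message class from the question above:

```
import org.apache.spark.sql.{DataFrame, Dataset, Encoder}
import scalapb.spark.Implicits  // no wildcard import, so nothing clashes with spark.implicits._

// ASSUMPTION: the member names below are illustrative; check the linked
// TypedEncoders.scala for the real encoder definitions.
implicit val subscribeResponseTypedEncoder: frameless.TypedEncoder[SubscribeResponse] =
  Implicits.messageTypedEncoder[SubscribeResponse]

val subscribeResponseEncoder: Encoder[SubscribeResponse] =
  Implicits.typedEncoderToEncoder[SubscribeResponse]

// Pass the encoder explicitly instead of relying on implicit search, which the
// notebook-level spark.implicits._ would otherwise win:
def toTypedDataset(df: DataFrame): Dataset[SubscribeResponse] =
  df.as[SubscribeResponse](subscribeResponseEncoder)
```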