Azure/spark-cdm-connector

Databricks Spark 3.x "doesn't work" / java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/ReadSupport

Closed this issue · 3 comments

Hi @asksparkcdm@microsoft.com,
I am from LinkedIn, and we are hitting a compatibility issue with spark-cdm-connector. For context: I have CDM data in ADLS that I am trying to read into Databricks 9.1 LTS (Apache Spark 3.1.2, Scala 2.12). I installed com.microsoft.azure:spark-cdm-connector:0.19.1 and org.neo4j:neo4j-connector-apache-spark_2.12:4.1.2_for_spark_3, and I also tried several other versions of both connectors, but every combination throws:
ERROR: java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/ReadSupport

Can you please point me to versions of spark-cdm-connector and neo4j-connector compatible with Databricks 9.1 LTS (Apache Spark 3.1.2, Scala 2.12) from the versions shown below?

spark-cdm-connector versions available in Databricks:
[screenshot: spark-cdm-connector versions]

neo4j-connector versions available in Databricks:
[screenshot: neo4j-connector versions]

ERROR context:
[screenshot: error stack trace, 2022-05-12]

Pavan

I am also facing a similar issue while using 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12).

Jar version 0.19.1 targets Spark 2.4, but you are running Spark 3, hence the NoClassDefFoundError: the ReadSupport class exists only in Spark 2.4.x.

| CDM Version | Spark Version |
| ----------- | ------------- |
| 0.x.x       | 2.4.x         |
| 1.x.x       | 3.1.x         |
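You can confirm the mismatch yourself with a quick classpath probe (a minimal sketch; the class name comes from the error above, and `CheckReadSupport`/`isPresent` are illustrative names, not part of the connector):

```java
// Minimal sketch: probe whether the Spark 2.4-era DataSource V2 API class
// (org.apache.spark.sql.sources.v2.ReadSupport) is loadable on the current classpath.
// On a Spark 3.x cluster this class is gone, which is why the 0.x connector fails.
public class CheckReadSupport {

    // Returns true when the given class can be loaded on the current classpath.
    static boolean isPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String cls = "org.apache.spark.sql.sources.v2.ReadSupport";
        if (isPresent(cls)) {
            System.out.println(cls + " found: Spark 2.4.x classpath, 0.x connector will load");
        } else {
            System.out.println(cls + " NOT found: Spark 3.x removed this API, use the 1.x connector line");
        }
    }
}
```

Running this in a notebook cell (or any JVM attached to the cluster) tells you immediately which connector line matches the runtime.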

If you want to use Databricks, you need to build the jar yourself. Even then, only app registration works, not credential passthrough. We haven't heard of any workaround for credential passthrough on Databricks, so the code is open-sourced for contributions.

Added to pinned issues #118