java.lang.NoSuchMethodError with scala/spark 2.10
Closed this issue · 10 comments
I'm getting the same error that raghugvt posted here. He solved the problem by bundling everything together into one jar, but that's not an option for me since I want to use spark-corenlp in a notebook.
My build.sbt is as follows:
version := "1.0"
scalaVersion := "2.10.6"
resolvers += "Spark Packages Repository" at "https://dl.bintray.com/spark-packages/maven/"
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10" % "2.1.0",
  "org.apache.spark" % "spark-sql_2.10" % "2.1.0",
  "com.databricks" % "spark-csv_2.10" % "1.5.0",
  "org.apache.spark" % "spark-mllib_2.10" % "2.1.0"
)
libraryDependencies += "edu.stanford.nlp" % "stanford-corenlp" % "3.7.0" withSources() withJavadoc()
libraryDependencies += "edu.stanford.nlp" % "stanford-corenlp" % "3.7.0" classifier "models"
libraryDependencies += "databricks" % "spark-corenlp" % "0.2.0-s_2.11"
I'm testing with this script:
import org.apache.spark.sql.functions._
import com.databricks.spark.corenlp.functions._
import org.apache.spark.sql.SparkSession
val spark = SparkSession
  .builder().master("local")
  .appName("Spark SQL basic example")
  .config("master", "spark://myhost:7077")
  .getOrCreate()
val sqlContext = spark.sqlContext
import sqlContext.implicits._

val input = Seq(
  (1, "<xml>Stanford University is located in California. It is a great university.</xml>")
).toDF("id", "text")

val output = input
  .select(cleanxml('text).as('doc))
  .select(explode(ssplit('doc)).as('sen))
  .select('sen, tokenize('sen).as('words), ner('sen).as('nerTags), sentiment('sen).as('sentiment))

output.show(truncate = false)
Which results in the error:
java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaUniverse$JavaMirror;
at com.databricks.spark.corenlp.functions$.cleanxml(functions.scala:54)
What is going wrong here?
I am facing the exact same issue. Can someone please help out here?
@warrenronsiek were you able to resolve this issue?
@saurabh14rajput I ended up not using this library. Instead I created a workaround with a UDF that wraps the Stanford CoreNLP features I wanted to use. Probably not the most efficient approach or best practice, but it turns out to be relatively fast.
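For anyone who wants to do the same, here is a rough sketch of that kind of wrapper (the CoreNLP object, the tokens UDF name, and the tokenize-only annotator list are illustrative, not my exact code):

import java.util.Properties
import scala.collection.JavaConverters._
import edu.stanford.nlp.ling.CoreAnnotations
import edu.stanford.nlp.pipeline.{Annotation, StanfordCoreNLP}
import org.apache.spark.sql.functions.udf

// Keep the pipeline in an object so each executor JVM builds it once
// lazily, instead of trying to serialize it from the driver.
object CoreNLP {
  lazy val pipeline: StanfordCoreNLP = {
    val props = new Properties()
    props.setProperty("annotators", "tokenize, ssplit")
    new StanfordCoreNLP(props)
  }
}

// UDF that annotates a string and returns its token strings.
val tokens = udf { text: String =>
  val doc = new Annotation(text)
  CoreNLP.pipeline.annotate(doc)
  doc.get(classOf[CoreAnnotations.TokensAnnotation]).asScala.map(_.word).toSeq
}

// Applied to the input DataFrame from the example above.
val withTokens = input.withColumn("words", tokens('text))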
Okay. Thanks!
Hey, why do you include corenlp in your dependencies? It is already pulled in by spark-corenlp, see https://github.com/databricks/spark-corenlp/blob/master/build.sbt#L39. Can you try corenlp version 3.6.0 and report back if you still have issues?
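Concretely, that means dropping your explicit stanford-corenlp entries and keeping only the models jar, pinned to the 3.6.0 that spark-corenlp itself builds against (a sketch, with the rest of your build.sbt unchanged):

// The main stanford-corenlp jar comes in transitively via spark-corenlp;
// only the models jar still needs an explicit entry.
libraryDependencies += "edu.stanford.nlp" % "stanford-corenlp" % "3.6.0" classifier "models"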
Hey, I think the issue is that you are mixing spark-corenlp built for Scala 2.11 with the rest of your dependencies built for Scala 2.10. You should replace the line
libraryDependencies += "databricks" % "spark-corenlp" % "0.2.0-s_2.11"
with
libraryDependencies += "databricks" % "spark-corenlp" % "0.2.0-s_2.10"
I'm getting the same error. I cloned this repo, ran sbt package to build the jar, then invoked spark-shell like this:
/opt/spark/spark-2.0.1/bin/spark-shell --jars ~/spark-corenlp_2.10-0.3.0-SNAPSHOT.jar
I get the error even if I specify library dependencies, like this:
/opt/spark/spark-2.0.1/bin/spark-shell --jars ~/spark-corenlp_2.10-0.3.0-SNAPSHOT.jar --packages databricks:spark-corenlp:0.2.0-s_2.10,edu.stanford.nlp:stanford-corenlp:3.7.0
No dice. Same error.
@zouzias using
libraryDependencies += "databricks" % "spark-corenlp" % "0.2.0-s_2.10"
solved the problem for the example I posted above. I can't speak for the other people who are getting the same error.