xerial/snappy-java

org.xerial.snappy.SnappyNative cannot be cast to org.xerial.snappy.SnappyNativeAPI

kingledion opened this issue · 2 comments

I'm getting a snappy error when running unit tests in Maven through spark-testing-base. The error only shows up during the unit tests; I have never had any snappy errors when spark-submitting jobs at the command line.

Dependencies are:

  • Spark 1.6.3
  • Scala 2.10.5
  • org.xerial.snappy:snappy-java:1.1.2.6
  • org.scalatest:scalatest_2.10:2.2.5
  • com.holdenkarau:spark-testing-base_2.10:1.6.3_0.5.0
  • Maven Surefire 2.12.4

The code causing the error is a Maven test, run during the test phase through JUnit:

import com.holdenkarau.spark.testing.SharedSparkContext
import org.junit.runner.RunWith
import org.scalatest.FunSuite
import org.scalatest.junit.JUnitRunner

@RunWith(classOf[JUnitRunner])
class HoldenkarauTest extends FunSuite with SharedSparkContext {
  test("test initializing spark context") {
    val list = List(1, 2, 3, 4)
    val rdd = sc.parallelize(list)

    assert(rdd.count === list.length)
  }
}

The error stack is below. Looking at the stack, I feel like this is a version conflict: two different versions of snappy talking to each other? I don't see the SnappyNativeAPI class anymore; was it deprecated in an older version? Any help would be appreciated.

test initializing spark context(com.pwc.kyv.HoldenkarauTest)  Time elapsed: 0.489 sec  <<< ERROR!
org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.ClassCastException: org.xerial.snappy.SnappyNative cannot be cast to org.xerial.snappy.SnappyNativeAPI
org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:320)
org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:97)
org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:89)
org.apache.spark.io.SnappyCompressionCodec.compressedOutputStream(CompressionCodec.scala:156)
org.apache.spark.broadcast.TorrentBroadcast$$anonfun$4.apply(TorrentBroadcast.scala:200)
org.apache.spark.broadcast.TorrentBroadcast$$anonfun$4.apply(TorrentBroadcast.scala:200)
scala.Option.map(Option.scala:145)
org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:200)
org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102)
org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85)
org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326)
org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1006)
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921)
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
        at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1016)
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921)
        at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
        at org.apache.spark.rdd.RDD.count(RDD.scala:1157)
        at com.pwc.kyv.HoldenkarauTest$$anonfun$1.apply$mcV$sp(HoldenkarauTest.scala:14)
        at com.pwc.kyv.HoldenkarauTest$$anonfun$1.apply(HoldenkarauTest.scala:10)
        at com.pwc.kyv.HoldenkarauTest$$anonfun$1.apply(HoldenkarauTest.scala:10)
        at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
        at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
        at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
        at org.scalatest.Transformer.apply(Transformer.scala:22)
        at org.scalatest.Transformer.apply(Transformer.scala:20)
        at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
        at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
        at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555)
        at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
        at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
        at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
        at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
        at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
        at org.scalatest.FunSuite.runTest(FunSuite.scala:1555)
        at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
        at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208)
        at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
        at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
        at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
        at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
        at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208)
        at org.scalatest.FunSuite.runTests(FunSuite.scala:1555)
        at org.scalatest.Suite$class.run(Suite.scala:1424)
        at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555)
        at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
        at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212)
        at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
        at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212)
        at com.pwc.kyv.HoldenkarauTest.org$scalatest$BeforeAndAfterAll$$super$run(HoldenkarauTest.scala:9)
        at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:257)
        at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:256)
        at com.pwc.kyv.HoldenkarauTest.run(HoldenkarauTest.scala:9)
        at org.scalatest.junit.JUnitRunner.run(JUnitRunner.scala:99)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
        at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
        at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
Caused by: java.lang.ClassCastException: org.xerial.snappy.SnappyNative cannot be cast to org.xerial.snappy.SnappyNativeAPI
        at org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:320)
        at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:97)
        at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:89)
        at org.apache.spark.io.SnappyCompressionCodec.compressedOutputStream(CompressionCodec.scala:156)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$4.apply(TorrentBroadcast.scala:200)
        at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$4.apply(TorrentBroadcast.scala:200)
        at scala.Option.map(Option.scala:145)
        at org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:200)
        at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:102)
        at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:85)
        at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
        at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
        at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1326)
        at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1006)
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921)
        at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:861)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1607)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

I’ve never seen this type of error before, but I guess it’s a class loader issue in your testing framework or in the combination of frameworks.
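
One quick way to confirm that is to list every copy of the Snappy class visible on the test classpath; more than one hit usually means duplicate snappy-java jars. This is only a sketch using standard JDK class-loader calls (the object name is mine, nothing snappy-java specific):

import scala.collection.JavaConverters._

object SnappyClasspathCheck {
  def main(args: Array[String]): Unit = {
    // Every URL printed here is a jar (or directory) that provides the Snappy class;
    // seeing two different locations points at a classpath conflict.
    getClass.getClassLoader
      .getResources("org/xerial/snappy/Snappy.class")
      .asScala
      .foreach(println)
  }
}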

Also, Spark 1.6.3 and Scala 2.10 are pretty old versions, so I recommend moving to more recent ones such as Spark 2.4 with Scala 2.11 (or Scala 2.12 for Spark 2.4).

Thanks for the feedback. It was indeed a class loader issue. One of the third-party dependencies we were using was an uber jar with all sorts of unbelievable stuff in it, including some old versions of snappy-java. Once we stripped that jar down to the bone, the problems resolved themselves. Sorry to clog things up here with a "not your issue".
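
For anyone hitting the same symptom: one way to check whether a suspect dependency bundles its own copy of snappy-java is to list the jar's entries. A rough sketch (the object name and the path argument are placeholders, not anything from this project):

import java.util.jar.JarFile
import scala.collection.JavaConverters._

object FindBundledSnappy {
  def main(args: Array[String]): Unit = {
    // args(0) is the path to the suspect jar; prints any snappy-java classes it bundles.
    val jar = new JarFile(args(0))
    try {
      jar.entries().asScala
        .map(_.getName)
        .filter(name => name.startsWith("org/xerial/snappy/") && name.endsWith(".class"))
        .foreach(println)
    } finally {
      jar.close()
    }
  }
}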