dfdx/Spark.jl

SparkContext giving StackOverflowError

Closed this issue · 5 comments

I'm following the basic steps in the tutorial and ran into some issues loading the SparkContext.
By the way, Spark.init() only works if JULIA_COPY_STACKS=1 is set; it would be good to clarify this for others in the documentation.

Setup

Apache Maven 3.6.3
Maven home: /usr/share/maven
Java version: 11.0.9.1, vendor: Ubuntu, runtime: /usr/lib/jvm/java-11-openjdk-amd64
Default locale: en_US, platform encoding: ANSI_X3.4-1968
OS name: "linux", version: "4.4.0-1112-aws", arch: "amd64", family: "unix"

spark-3.0.1-bin-hadoop3.2

Using Scala version 2.12.10 (OpenJDK 64-Bit Server VM, Java 11.0.9.1)

Code

app# JULIA_COPY_STACKS=1 julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.4.1 (2020-04-14)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using Spark

julia> Spark.init()

julia> sc = SparkContext(master="local")

Error

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/11/14 23:43:06 WARN Utils: Your hostname, ip-10-202-48-234 resolves to a loopback address: 127.0.0.1; using 10.202.48.234 instead (on interface ens3)
20/11/14 23:43:06 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/opt/spark/spark-3.0.1-bin-hadoop3.2/jars/spark-unsafe_2.12-3.0.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
20/11/14 23:43:06 INFO SparkContext: Running Spark version 3.0.1
Exception in thread "process reaper" java.lang.StackOverflowError
        at java.base/java.lang.invoke.MethodType$ConcurrentWeakInternSet$WeakEntry.equals(MethodType.java:1341)
        at java.base/java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:940)
        at java.base/java.lang.invoke.MethodType$ConcurrentWeakInternSet.get(MethodType.java:1279)
        at java.base/java.lang.invoke.MethodType.makeImpl(MethodType.java:300)
        at java.base/java.lang.invoke.MethodTypeForm.canonicalize(MethodTypeForm.java:355)
        at java.base/java.lang.invoke.MethodTypeForm.findForm(MethodTypeForm.java:317)
        at java.base/java.lang.invoke.MethodType.makeImpl(MethodType.java:315)
        at java.base/java.lang.invoke.MethodType.insertParameterTypes(MethodType.java:410)
        at java.base/java.lang.invoke.VarHandle$AccessDescriptor.<init>(VarHandle.java:1853)
        at java.base/java.lang.invoke.MethodHandleNatives.varHandleOperationLinkerMethod(MethodHandleNatives.java:518)
        at java.base/java.lang.invoke.MethodHandleNatives.linkMethodImpl(MethodHandleNatives.java:462)
        at java.base/java.lang.invoke.MethodHandleNatives.linkMethod(MethodHandleNatives.java:450)
        at java.base/java.util.concurrent.CompletableFuture.completeValue(CompletableFuture.java:305)
        at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2072)
        at java.base/java.lang.ProcessHandleImpl$1.run(ProcessHandleImpl.java:162)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
dfdx commented

The most likely cause is Spark 3, which we haven't tested against yet. Can you try it with Spark 2.4?

Good point @dfdx
I changed the version to this one: https://downloads.apache.org/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz

Now I'm getting:

julia> using Spark

julia> Spark.init()
ERROR: ArgumentError: invalid index: nothing of type Nothing
Stacktrace:
 [1] to_index(::Nothing) at ./indices.jl:297
 [2] to_index(::Array{String,1}, ::Nothing) at ./indices.jl:274
 [3] to_indices at ./indices.jl:325 [inlined]
 [4] to_indices at ./indices.jl:322 [inlined]
 [5] getindex at ./abstractarray.jl:980 [inlined]
 [6] load_spark_defaults(::Dict{Any,Any}) at /root/.julia/packages/Spark/3MVGw/src/init.jl:55
 [7] init() at /root/.julia/packages/Spark/3MVGw/src/init.jl:5
 [8] top-level scope at REPL[2]:1

What is your setup?
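For what it's worth, the stacktrace points at load_spark_defaults, so one guess (an assumption, not verified) is that it returns `nothing` when `$SPARK_HOME/conf/spark-defaults.conf` is missing. If that's the cause, creating the file from the template that ships with Spark might work around it:

```shell
# Hypothesis (unverified): load_spark_defaults indexes into the parsed
# config and gets `nothing` back when spark-defaults.conf is missing.
# Create the file from the template Spark ships with.
SPARK_HOME="${SPARK_HOME:-/opt/spark/spark-2.4.7-bin-hadoop2.7}"
if [ -f "$SPARK_HOME/conf/spark-defaults.conf.template" ]; then
    cp "$SPARK_HOME/conf/spark-defaults.conf.template" \
       "$SPARK_HOME/conf/spark-defaults.conf"
fi
```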

dfdx commented

I'm using Julia 1.5 and all the default settings, which result in Spark 2.4.7.

How do you set the Spark version? Do you use the SPARK_CONF or SPARK_HOME environment variables for this?
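A quick way to check what the shell actually exports before starting Julia (just the two variables asked about above):

```shell
# Print the variables that may control which Spark installation is used;
# "<unset>" is shown when a variable is not defined.
echo "SPARK_HOME=${SPARK_HOME:-<unset>}"
echo "SPARK_CONF=${SPARK_CONF:-<unset>}"
```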

exyi commented

I ran into the same problem with the StackOverflowError, and it was solved by switching to Java 8. I think there should be a warning when a newer Java version is picked up by accident; it was quite hard to find what was causing the problem 😅
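To see which JVM actually ends up on the PATH, and to point the session at Java 8 before starting Julia (the install path below is a hypothetical Ubuntu default for openjdk-8; adjust it for your system):

```shell
# Show the JVM currently on the PATH (it prints its version to stderr).
java -version 2>&1 | head -n 1

# Prepend a Java 8 install for this session before launching Julia.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"
```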

dfdx commented

Closing as outdated.