dotnet/spark

[BUG]: Sample From Readme Not Running Ubuntu 18.04

lqdev opened this issue · 2 comments

lqdev commented

Describe the bug
Running the README sample on Ubuntu 18.04 produces the following output and error:

19/04/25 06:39:40 WARN Utils: Your hostname, localhost resolves to a loopback address: 127.0.0.1; using <MY-PUBLIC-IP> instead (on interface eth0)
19/04/25 06:39:40 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
19/04/25 06:39:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/internal/Logging$class
	at org.apache.spark.deploy.DotnetRunner$.<init>(DotnetRunner.scala:34)
	at org.apache.spark.deploy.DotnetRunner$.<clinit>(DotnetRunner.scala)
	at org.apache.spark.deploy.DotnetRunner.main(DotnetRunner.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.internal.Logging$class
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 15 more
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

To Reproduce

Using this code in Program.cs:

using System;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

namespace HelloSpark
{
    class Program
    {
        static void Main(string[] args)
        {
            var spark = SparkSession.Builder().GetOrCreate();
            var df = spark.Read().Json("people.json");
            df.Show();
        }
    }
}

and this data in file people.json:

{"name":"Michael"} 
{"name":"Andy", "age":30} 
{"name":"Justin", "age":19} 

Then run the following command in the terminal:

spark-submit \
--class org.apache.spark.deploy.DotnetRunner \
--master local \
./bin/Debug/netcoreapp2.1/linux-x64/publish/microsoft-spark-2.4.x-0.1.0.jar \
./bin/Debug/netcoreapp2.1/linux-x64/publish/HelloSpark

Expected behavior
No errors.

Desktop (please complete the following information):

  • OS: Ubuntu
  • Version: 18.04

Additional context

See the draft publish in #13 for the steps used to set up the environment. Spark and Java are confirmed working.

Can you please try Spark v2.4.0 or 2.4.1? 2.4.2 was released just two days ago, and support for it is not in yet.
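To see which Spark version is on the path, `spark-submit --version` prints a banner that includes it. A minimal sketch of checking that version against the two suggested above might look like the following. The helper name `is_suggested_spark_version` and the hard-coded `installed` value are illustrative assumptions, not part of dotnet/spark or Spark itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of dotnet/spark): succeeds only for the
# Spark versions suggested in this thread, 2.4.0 and 2.4.1.
is_suggested_spark_version() {
    case "$1" in
        2.4.0|2.4.1) return 0 ;;
        *)           return 1 ;;
    esac
}

# Assumption: replace this with the version shown by `spark-submit --version`.
installed="2.4.2"

if is_suggested_spark_version "$installed"; then
    echo "Spark $installed: one of the versions suggested in this thread"
else
    echo "Spark $installed: try 2.4.0 or 2.4.1 instead"
fi
```

The list of versions is fixed to what this thread mentions, so it would need updating as support for later Spark releases lands.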

lqdev commented

2.4.1 works. Thanks!