Huawei-Spark/Spark-SQL-on-HBase

MR job failure caused by exporting SPARK_CLASSPATH

AllenFang opened this issue · 15 comments

Hi, this is cool stuff for Spark SQL with HBase. However, I've run into an issue, as follows:

I've installed your product following the documentation and it all works well so far. But I wrote a very simple Spark application that queries an HBase table using newAPIHadoopRDD and got these errors:

Application application_1439169262151_0037 failed 2 times due to AM Container for appattempt_1439169262151_0037_000002 exited with  exitCode: 127 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: 
org.apache.hadoop.util.Shell$ExitCodeException: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:114)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
    at com.hbase.HBaseQueryWithRDD$.main(HBaseQueryWithRDD.scala:18)
    at com.hbase.HBaseQueryWithRDD.main(HBaseQueryWithRDD.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2015-08-12 10:15:53,360 INFO  [main] scheduler.DAGScheduler (Logging.scala:logInfo(59)) - Stopping DAGScheduler
2015-08-12 10:15:53,362 ERROR [main] spark.SparkContext (Logging.scala:logError(96)) - Error stopping SparkContext after init error.
java.lang.NullPointerException
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.stop(YarnClientSchedulerBackend.scala:150)
    at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:416)
    at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1404)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1642)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:565)
    at com.hbase.HBaseQueryWithRDD$.main(HBaseQueryWithRDD.scala:18)
    at com.hbase.HBaseQueryWithRDD.main(HBaseQueryWithRDD.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:114)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:497)
    at com.hbase.HBaseQueryWithRDD$.main(HBaseQueryWithRDD.scala:18)
    at com.hbase.HBaseQueryWithRDD.main(HBaseQueryWithRDD.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

But if I remove spark-sql-on-hbase-1.0.0.jar from the SPARK_CLASSPATH, the job passes.

My Spark version is 1.4.0 and my Hadoop version is 2.3.

Can you check Yarn's log to see why the Application Master can't be launched, which seems to be the root cause of your exception?
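For what it's worth, exit code 127 from container-launch usually means the AM launch script tried to run a command that was not found on the node. If log aggregation is enabled on your cluster, you can pull the full AM container log with the yarn CLI, e.g. for the application id in your trace:

yarn logs -applicationId application_1439169262151_0037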

Hi @yzhou2001, the error message I provided is already from the YARN logs, and the first error message is:

Application application_1439169262151_0037 failed 2 times due to AM Container for appattempt_1439169262151_0037_000002 exited with  exitCode: 127 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: 
org.apache.hadoop.util.Shell$ExitCodeException: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

The message is just like the one I posted above.

Allen,

You say hadoop-yarn v2.3.0 is included in your Spark v1.4.0 shaded jar.

What version of hadoop-yarn is included in the spark-sql-on-hbase-1.0.0.jar you removed from $SPARK_CLASSPATH? Is it also v2.3.0?

And can you post your spark program (driver) to help us reproduce the problem?

Thanks,
Stan

I used this command to package the jar, but I'm not sure whether it's correct:

mvn clean package -Phbase,hadoop-2.3 -DskipTests

So how do I package the jar against a specific Hadoop version?

And my driver program is below:

val tableName = "XXX";
val conf = new SparkConf().setAppName("HBase_Query_with_RDD");
val sc   = new SparkContext(conf);
    
val hbaseConf = HBaseConfiguration.create();
hbaseConf.set("hbase.zookeeper.quorum","server-a1")
hbaseConf.set("hbase.zookeeper.property.clientPort","2181")
hbaseConf.set("mapreduce.framework.name", "yarn")
hbaseConf.set("yarn.resourcemanager.address", "server-a1:8032")
hbaseConf.set("yarn.resourcemanager.scheduler.address", "server-a1:8030")
hbaseConf.set("yarn.resourcemanager.resource-tracker.address", "server-a1:8031");
hbaseConf.set("yarn.resourcemanager.admin.address", "server-a1:8033")
hbaseConf.set(TableInputFormat.INPUT_TABLE, tableName)
    
var table = new HTable(hbaseConf, tableName)
val hbaseRDD = sc.newAPIHadoopRDD(hbaseConf, classOf[TableInputFormat], 
                                classOf[ImmutableBytesWritable], 
                                classOf[Result])
    
println("contain result: " + hbaseRDD.count())
table.close()
sc.stop() 

And here are the YARN ResourceManager messages for more detail; I hope they're helpful for you.

2015-08-13 10:25:28,434 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1439169262151_0057_000002 State change from FINAL_SAVING to FAILED
2015-08-13 10:25:28,434 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1439169262151_0057 with final state: FAILED
2015-08-13 10:25:28,434 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1439169262151_0057 State change from ACCEPTED to FINAL_SAVING
2015-08-13 10:25:28,434 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application appattempt_1439169262151_0057_000002 is done. finalState=FAILED
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1439169262151_0057
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1439169262151_0057 requests cleared
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1439169262151_0057 user: user1 queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1439169262151_0057 user: user1 leaf-queue of parent: root #applications: 0
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1439169262151_0057 failed 2 times due to AM Container for appattempt_1439169262151_0057_000002 exited with  exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
.Failing this attempt.. Failing the application.
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1439169262151_0057 State change from FINAL_SAVING to FAILED
2015-08-13 10:25:28,435 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=user1    OPERATION=Application Finished - Failed TARGET=RMAppManager     RESULT=FAILURE  DESCRIPTION=App failed with state: FAILED       PERMISSIONS=Application application_1439169262151_0057 failed 2 times due to AM Container for appattempt_1439169262151_0057_000002 exited with  exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Hi Allen,

I just wanted to make sure you did not have hadoop-yarn version conflicts. I don't think you do, and you packaged the jar correctly.
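(And to your packaging question: the -Phadoop-2.3 profile in your mvn command should be what selects the Hadoop version, so your build line already handles that.)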

You do not need to run your Yarn container to create a NewHadoopRDD with Spark-SQL-on-HBase.

Here is a simple example that works in my environment.

The data set:

1,xiaoming,16,id_1,teacherW
2,xiaoming,16,id_2,teacherW
3,xiaoming,16,id_3,teacherW
4,xiaoming,16,id_4,teacherW
5,xiaoming,16,id_5,teacherW
6,xiaoming,16,id_6,teacherW
7,xiaoming,16,id_7,teacherW
8,xiaoming,16,id_8,teacherW
9,xiaoming,16,id_9,teacherW
10,xiaoming,16,id_10,teacherW
11,xiaoming,16,id_11,teacherW
12,xiaoming,16,id_12,teacherW
13,xiaoming,16,id_13,teacherW
14,xiaoming,16,id_14,teacherW
15,xiaoming,16,id_15,teacherW
16,xiaoming,16,id_16,teacherW
17,xiaoming,16,id_17,teacherW
18,xiaoming,16,id_18,teacherW
19,xiaoming,16,id_19,teacherW
1001,lihua,20,A1000,
1002,lihua,20,A1000,

column-family='cf'
rowkey:string
columns:datatype -> a:string, b:string, c:string, d:string (col 'd' is nullable)
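
In case you want to recreate the table itself, here is a rough sketch (my assumption: the 0.98-era HBase admin API that this project builds against; adjust for your client version):

import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor, TableName}

// Create the PEOPLE table with the single column family 'cf'
// (columns a, b, c and d all live in this family; 'd' may be empty).
val conf = HBaseConfiguration.create()
val admin = new HBaseAdmin(conf)
if (!admin.tableExists("PEOPLE")) {
  val desc = new HTableDescriptor(TableName.valueOf("PEOPLE"))
  desc.addFamily(new HColumnDescriptor("cf"))
  admin.createTable(desc)
}
admin.close()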

The driver:

package org.apache.spark.sql.hbase

import org.apache.hadoop.hbase.client.{HTable, Result}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.{Cell, CellUtil}
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object NewHadoopRDDExample {

  def main(args: Array[String]) {
    println("NewHadoopRDDExample")

    val sparkHome = System.getenv("SPARK_HOME")
    val tableName = "PEOPLE"
    val sparkConf = new SparkConf(true)
      .setMaster("local[2]")
      .setAppName("NewHadoopRDDExample")
      .set("spark.executor.memory", "1g")

    val sc = new SparkContext(sparkConf)

    val hbaseContext = new org.apache.spark.sql.hbase.HBaseSQLContext(sc)
    val hbaseConf = hbaseContext.sparkContext.hadoopConfiguration
    hbaseConf.set("fs.defaultFS", "hdfs://YOUR-NAMENODE:54310")
    hbaseConf.set("hbase.zookeeper.quorum", "YOUR-ZK-CNXN-STRING")
    hbaseConf.set(TableInputFormat.INPUT_TABLE, tableName)

    val table = new HTable(hbaseConf, tableName)

    val hbaseRDD = sc.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])
    println("HBase RDD Count: " + hbaseRDD.count)


    println("\nHBase KeyValues:")
    hbaseRDD.foreach(println)

    // Have to map ImmutableBytesWritables to serializable objects before running rdd.collect
    val cellsRDD: RDD[(String, Array[String])] = hbaseRDD.map(x => x._2).map(result => {
      val rowkey = result.getRow
      val col1: Cell = result.getColumnLatestCell(Bytes.toBytes("cf"), Bytes.toBytes("a"))
      val col2: Cell = result.getColumnLatestCell(Bytes.toBytes("cf"), Bytes.toBytes("b"))
      val col3: Cell = result.getColumnLatestCell(Bytes.toBytes("cf"), Bytes.toBytes("c"))
      val col4: Cell = result.getColumnLatestCell(Bytes.toBytes("cf"), Bytes.toBytes("d"))

      val arr = new Array[String](4)
      arr(0) = (Bytes.toStringBinary(CellUtil.cloneValue(col1)))
      arr(1) = (Bytes.toStringBinary(CellUtil.cloneValue(col2)))
      arr(2) = (Bytes.toStringBinary(CellUtil.cloneValue(col3)))
      // col 'd' is nullable
      arr(3) = if (col4 != null) (Bytes.toStringBinary(CellUtil.cloneValue(col4))) else null
      (Bytes.toStringBinary(rowkey), arr)
    })

    println("\nDeserialized Rows:")
    val tuples = cellsRDD.collect
    for (i <- 0 until tuples.length) {
      print("Row: " + tuples(i)._1)
      print(" => " + tuples(i)._2.mkString(" | "))
      println
    }

    table.close()
  }
}
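
(For reference, I run this object directly from my IDE; since the master is set to local[2] above, no cluster submission is involved.)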

Sorry, I don't know what "running your Yarn container to create a NewHadoopRDD with Spark-SQL-on-HBase" actually means.

Anyway, if I just write a Spark application that reads an HDFS file and counts the rows, the error message is the same.

Your driver's configuration assumes a running Yarn container:
hbaseConf.set("mapreduce.framework.name", "yarn")
hbaseConf.set("yarn.resourcemanager.address", "server-a1:8032")
hbaseConf.set("yarn.resourcemanager.scheduler.address", "server-a1:8030")
hbaseConf.set("yarn.resourcemanager.resource-tracker.address", "server-a1:8031");
hbaseConf.set("yarn.resourcemanager.admin.address", "server-a1:8033")

Can you run the driver example I posted, using Spark-SQL-on-HBase?

You will see no references to Yarn in the working example's configuration, and you should have no Yarn container start-up or connection errors because Spark-SQL-on-HBase will not try to submit a job to Yarn.

Hi @sparksburnitt, yeah, you're right: if I use your sample with Spark-SQL-on-HBase, it works. Thanks a lot. But just like I said before, the error where a very simple Spark application running on YARN fails to start the application master still exists. ;(

Hello Allen,

I was able to run your Spark-only code with spark-sql-on-hbase-1.0.0.jar in the SPARK_CLASSPATH without a problem.

You do not need those yarn property settings in your HBase config object.

Why don't you try it again after replacing the yarn properties below with your 'fs.defaultFS' value?
/*
hbaseConf.set("mapreduce.framework.name", "yarn")
hbaseConf.set("yarn.resourcemanager.address", "server-a1:8032")
hbaseConf.set("yarn.resourcemanager.scheduler.address", "server-a1:8030")
hbaseConf.set("yarn.resourcemanager.resource-tracker.address", "server-a1:8031")
hbaseConf.set("yarn.resourcemanager.admin.address", "server-a1:8033")
*/

// Spark needs to know where your hdfs root is:
hbaseConf.set("fs.defaultFS", "hdfs://YOUR-NAMENODE:PORT")

Let me know if you still see yarn errors.

-Stan

Hi @sparksburnitt, I've already tried that, but the result is the same. I forgot to tell you, very sorry.

Hi Allen,

I just ran the example with your yarn settings (adjusted for my environment) and still could not reproduce the errors.

How are you submitting the job?

  • Are you submitting a Hadoop MapReduce job from the command line, using the 'hadoop' shell script?
  • Are you submitting a Spark job using spark-submit?
  • Are you running a Scala script from the spark-shell?
  • A Spark driver executing in your IDE?

(I have been running these examples from a Scala object in an IDE.)

Hi Allen,

I ran the Scala script below in the spark-shell (v1.4.0), and could not reproduce the errors.

Does it work on your cluster?

...

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Result}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

val tableName = "XXXX_TBL"

val hbaseConf = HBaseConfiguration.create
hbaseConf.set("fs.defaultFS", "hdfs://SERVER:54310")
hbaseConf.set("hbase.zookeeper.quorum", "SERVER:2181")

hbaseConf.set("mapreduce.framework.name", "yarn")
hbaseConf.set("yarn.resourcemanager.address", "SERVER:8032")
hbaseConf.set("yarn.resourcemanager.scheduler.address", "SERVER:8030")
hbaseConf.set("yarn.resourcemanager.resource-tracker.address", "SERVER:8025")
hbaseConf.set("yarn.resourcemanager.admin.address", "SERVER:8033")

hbaseConf.set(TableInputFormat.INPUT_TABLE, tableName)

val table = new HTable(hbaseConf, tableName)

val hbaseRDD = sc.newAPIHadoopRDD(
  hbaseConf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])

println("HBase RDD Count: " + hbaseRDD.count)
table.close
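
A note on how I launch the shell: I have the jar on the classpath via SPARK_CLASSPATH. If you prefer not to export it, an equivalent launch should be something like:

bin/spark-shell --jars /path/to/spark-sql-on-hbase-1.0.0.jar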

Hi @sparksburnitt, I always run my application via spark-submit in yarn-client mode. But I did the following tests, inspired by you.

  1. I've written some very simple Scala code and packaged it as a jar:

    ...
    val input = sc.parallelize(Array(1,2,3,4,5))
    println(input.count())
    sc.stop() 
    
  • Run via spark-submit with yarn-client: the result is failure, and the problem is the same.
  • Run via spark-submit with yarn-cluster: the result is success.
  • Run via spark-submit with local: the result is success.
  • Run the code in spark-shell:
    The result is success, but I think spark-shell always runs locally; it doesn't use YARN to run this job, so the error about starting the application master never happens.

Anyway, I also ran the code you provided above, and the result is OK!! But I have no idea why the application fails only when running on yarn-client. I think I won't try to solve this problem for now, because running on yarn-cluster works, so I can keep working. In any case, thanks for your help, and very sorry to have taken so much of your time.

Hello Allen,

I was able to run your job in yarn-client mode via spark-submit.

In the driver, I changed the spark-conf's master to 'yarn-client', e.g.

val sparkConf = new SparkConf(true)
  .setMaster("yarn-client")
  .setAppName("SparkOnYarnExample")
  .set("spark.executor.memory", "4g")

and built a 'fang.jar' containing the spark driver.

Here is the spark-submit command:

export SPARK_JAR=YR_PATH/spark-assembly-1.4.0-hadoop2.4.0.jar
export SPARK_SQL_HBASE_JAR=YR_PATH/spark-sql-on-hbase-1.0.0.jar

bin/spark-submit --class org.apache.spark.sql.hbase.SparkOnYarnExample \
  --master yarn-client \
  --jars $SPARK_SQL_HBASE_JAR \
  --num-executors 6 \
  --driver-memory 4g \
  --executor-memory 4g \
  --executor-cores 6 \
  /tmp/fang.jar
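
One last thought, offered as an untested suggestion rather than a verified fix: SPARK_CLASSPATH is deprecated in Spark 1.x, and exporting it in yarn-client mode can affect how the AM is launched. If your yarn-client failure persists, try dropping the export and passing the jar explicitly instead:

bin/spark-submit --class org.apache.spark.sql.hbase.SparkOnYarnExample \
  --master yarn-client \
  --jars $SPARK_SQL_HBASE_JAR \
  --driver-class-path $SPARK_SQL_HBASE_JAR \
  /tmp/fang.jar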