melin/spark-jobserver

Can not get Hadoop Configuration of Test

Opened this issue · 2 comments

2023-04-14 10:32:46,511 INFO i.g.m.s.j.s.ClusterManager [check-yarn-app-0] Current YARN cluster available memory: 5074MB, minimum required memory: 4096, adjustable via parameter (unit: MB): jobserver.yarn.min.memory.mb
2023-04-14 10:32:46,511 INFO i.g.m.s.j.s.ClusterManager [check-yarn-app-0] Current YARN cluster available CPU count: 4143, minimum required CPUs: 5, adjustable via parameter: jobserver.yarn.min.cpu.cores
2023-04-14 10:32:46,515 INFO i.g.m.s.j.d.AbstractDriverDeployer [check-yarn-app-0] Get redis lock
2023-04-14 10:32:46,538 INFO i.g.m.s.j.d.YarnSparkDriverDeployer [check-yarn-app-0] Pre-starting driver Id: 309
2023-04-14 10:32:46,538 INFO i.g.m.s.j.d.AbstractDriverDeployer [check-yarn-app-0] sparkHome: /opt/apps/SPARK3/spark-3.3.0-hadoop2.8-1.0.0
2023-04-14 10:32:46,545 INFO i.g.m.s.j.d.YarnSparkDriverDeployer [check-yarn-app-0] Failed to start jobserver: Can not get Hadoop Configuration of New-Offline
java.lang.RuntimeException: Can not get Hadoop Configuration of Test
at io.github.melin.spark.jobserver.support.ClusterManager.loadYarnConfig(ClusterManager.java:415) ~[conf/:?]
at io.github.melin.spark.jobserver.deployment.AbstractDriverDeployer.startApplication(AbstractDriverDeployer.java:192) ~[conf/:?]
at io.github.melin.spark.jobserver.deployment.YarnSparkDriverDeployer.buildJobServer(YarnSparkDriverDeployer.java:111) ~[conf/:?]
at io.github.melin.spark.jobserver.monitor.DriverPoolManager.lambda$startMinJobServer$1(DriverPoolManager.java:131) ~[conf/:?]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_362]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_362]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) ~[hadoop-common-2.8.5.jar:?]
at io.github.melin.spark.jobserver.support.ClusterManager.runSecured(ClusterManager.java:142) ~[conf/:?]
at io.github.melin.spark.jobserver.monitor.DriverPoolManager.startMinJobServer(DriverPoolManager.java:130) ~[conf/:?]
at io.github.melin.spark.jobserver.monitor.DriverPoolManager.lambda$afterPropertiesSet$0(DriverPoolManager.java:72) ~[conf/:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_362]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_362]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_362]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_362]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_362]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_362]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]

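The cluster information appears to have been fetched, but startup fails with "Can not get Hadoop Configuration". From reading the source, this error means the YARN cluster configuration could not be obtained. However, the cluster's configuration files have already been generated under /tmp/spark-jobserver: core-site.xml, hdfs-site.xml, hive-site.xml, and yarn-site.xml are all present.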
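
For anyone reproducing this, a quick way to confirm the observation above is to check that each of the four files exists and is non-empty. The following is only a minimal sketch using the Java standard library: the class name `HadoopConfCheck` and its method are hypothetical, not part of spark-jobserver, and it assumes the files sit directly in the directory you pass in (the real layout under /tmp/spark-jobserver may use per-cluster subdirectories). It does not reproduce what `ClusterManager.loadYarnConfig` actually does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic helper; not part of spark-jobserver.
final class HadoopConfCheck {
    // File names as listed in the report above.
    static final List<String> REQUIRED = Arrays.asList(
            "core-site.xml", "hdfs-site.xml", "hive-site.xml", "yarn-site.xml");

    /** Returns the required file names that are missing, unreadable, or empty in dir. */
    static List<String> missingFiles(Path dir) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            Path file = dir.resolve(name);
            try {
                if (!Files.isRegularFile(file)
                        || !Files.isReadable(file)
                        || Files.size(file) == 0) {
                    missing.add(name);
                }
            } catch (IOException e) {
                // Treat a file whose size cannot be read as missing.
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Default directory is the one mentioned in the report; pass another as args[0].
        Path dir = Paths.get(args.length > 0 ? args[0] : "/tmp/spark-jobserver");
        List<String> missing = missingFiles(dir);
        if (missing.isEmpty()) {
            System.out.println("All config files present in " + dir);
        } else {
            System.out.println("Missing or empty in " + dir + ": " + missing);
        }
    }
}
```

If this reports nothing missing while the exception still occurs, the files themselves are probably not the problem, which is consistent with the failure being in how the server looks the configuration up.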

melin commented

Please update to the latest code; this has been fixed.