Error? spark-submit didn't work!
honghaolin opened this issue · 2 comments
Hi,
Thanks for providing these awesome Docker images — they are very helpful! I am trying to follow the examples to set up docker-compose.yaml, but it does not seem to work.
bash-5.0# ./submit.sh
Submit application /app/entrypoint.py to Spark master spark://spark-master:7077
Passing arguments
21/02/16 17:40:54 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/02/16 17:40:55 INFO SparkContext: Running Spark version 2.4.5
21/02/16 17:40:55 INFO SparkContext: Submitted application: testing
21/02/16 17:40:55 INFO SecurityManager: Changing view acls to: root
21/02/16 17:40:55 INFO SecurityManager: Changing modify acls to: root
21/02/16 17:40:55 INFO SecurityManager: Changing view acls groups to:
21/02/16 17:40:55 INFO SecurityManager: Changing modify acls groups to:
21/02/16 17:40:55 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
21/02/16 17:40:56 INFO Utils: Successfully started service 'sparkDriver' on port 46471.
21/02/16 17:40:56 INFO SparkEnv: Registering MapOutputTracker
21/02/16 17:40:56 INFO SparkEnv: Registering BlockManagerMaster
21/02/16 17:40:56 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
21/02/16 17:40:56 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
21/02/16 17:40:56 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-b86a6147-f336-47c1-91b4-f2cdea03bf81
21/02/16 17:40:56 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
21/02/16 17:40:56 INFO SparkEnv: Registering OutputCommitCoordinator
21/02/16 17:40:56 INFO Utils: Successfully started service 'SparkUI' on port 4040.
21/02/16 17:40:57 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://spark-submit:4040
21/02/16 17:40:57 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://spark-master:7077...
21/02/16 17:40:57 INFO TransportClientFactory: Successfully created connection to spark-master/172.30.0.2:7077 after 91 ms (0 ms spent in bootstraps)
21/02/16 17:40:57 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20210216174057-0000
21/02/16 17:40:57 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37859.
21/02/16 17:40:57 INFO NettyBlockTransferService: Server created on spark-submit:37859
21/02/16 17:40:57 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
21/02/16 17:40:57 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, spark-submit, 37859, None)
21/02/16 17:40:57 INFO BlockManagerMasterEndpoint: Registering block manager spark-submit:37859 with 366.3 MB RAM, BlockManagerId(driver, spark-submit, 37859, None)
21/02/16 17:40:57 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, spark-submit, 37859, None)
21/02/16 17:40:57 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, spark-submit, 37859, None)
21/02/16 17:40:58 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
21/02/16 17:40:58 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/spark-warehouse').
21/02/16 17:40:58 INFO SharedState: Warehouse path is 'file:/spark-warehouse'.
21/02/16 17:40:59 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
I have set up a spark-master, a spark-worker, and a spark-submit. I use the command tail -F anything to keep spark-submit running, then go into the container to run submit.sh. The above is the log I get; it keeps running, but I would have expected some output to be printed.
I pressed Ctrl+C to stop it after 5 minutes, and here is the traceback:
Traceback (most recent call last):
  File "/app/entrypoint.py", line 29, in <module>
    main()
  File "/app/entrypoint.py", line 25, in main
    print(nums.map(lambda x: x * x).collect())
  File "/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 816, in collect
  File "/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1255, in __call__
  File "/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
  File "/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1152, in send_command
  File "/usr/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
  File "/spark/python/lib/pyspark.zip/pyspark/context.py", line 270, in signal_handler
KeyboardInterrupt
Here is my code in entrypoint.py:
from pyspark.conf import SparkConf
from pyspark.sql import SparkSession


def init_spark():
    conf = SparkConf().setAppName("testing")
    conf.setAll(
        {
            "spark.cores.max": "2",
            "spark.driver.memory": "4g",
            "spark.executor.cores": "2",
            "spark.executor.memory": "4g",
            "spark.sql.shuffle.partitions": "2",
        }.items()
    )
    spark = SparkSession.builder.config(conf=conf).getOrCreate()
    spark.sparkContext.setLogLevel("ERROR")
    return spark


def main():
    spark = init_spark()
    nums = spark.sparkContext.parallelize([1, 2, 3, 4])
    print(nums.map(lambda x: x * x).collect())


if __name__ == "__main__":
    main()
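(A side note on the configuration above: SparkConf.setAll takes an iterable of (key, value) pairs, which is exactly what the dict's .items() call produces. A minimal plain-Python illustration, no Spark involved:)

```python
# Illustrative subset of the settings dict used in entrypoint.py
settings = {
    "spark.cores.max": "2",
    "spark.executor.memory": "4g",
}

# dict.items() yields (key, value) tuples; SparkConf.setAll accepts
# exactly this shape and applies each pair in turn.
pairs = list(settings.items())
print(pairs)  # [('spark.cores.max', '2'), ('spark.executor.memory', '4g')]
```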
Here is my docker-compose.yaml:
version: "3.0"
services:
  spark-master:
    image: bde2020/spark-master:2.4.5-hadoop2.7
    hostname: spark-master
    container_name: spark-master
    ports:
      - 8080:8080
      - 7077:7077
    environment:
      INIT_DAEMON_STEP: setup_spark
  spark-worker:
    image: bde2020/spark-master:2.4.5-hadoop2.7
    hostname: spark-worker
    container_name: spark-worker
    depends_on:
      - spark-master
    ports:
      - 8081:8081
    environment:
      SPARK_MASTER: spark://spark-master:7077
  spark-submit:
    build: ./streaming
    hostname: spark-submit
    container_name: spark-submit
    depends_on:
      - spark-master
      - spark-worker
    environment:
      SPARK_MASTER_NAME: spark-master
      SPARK_MASTER_PORT: 7077
      ENABLE_INIT_DAEMON: "false"
    command: tail -F anything
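One quick way to check the cluster wiring before submitting (a diagnostic sketch, assuming the container names and port mappings from the compose file above) is to confirm that a worker actually registered with the master. When no worker is registered, the master never grants executors, and collect() blocks forever — which matches the hang described above.

```shell
# Look for the standalone master's worker-registration log line
# (container name "spark-master" comes from the compose file above)
docker logs spark-master 2>&1 | grep -i "registering worker"

# The master web UI at http://localhost:8080 should also list one ALIVE
# worker; an empty worker list means executors can never be scheduled.
```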
Could someone give me an idea of how to solve this? I must be missing something here.
Thanks!
Hi @honghaolin,
thanks a lot for your feedback and for reporting this issue.
I tried to reproduce it and did not get the error you are seeing.
You mentioned that it did not work with docker-compose? I set it up via docker-compose myself (of course, I removed the tail -F anything command so that it does not try to tail a file that does not exist, and instead prints out the results).
I reused your example and got this result:
➜ docker-spark git:(master) ✗ docker logs spark-submit
Submit application /app/entrypoint.py to Spark master spark://spark-master:7077
Passing arguments
21/03/22 22:24:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/03/22 22:24:08 INFO SparkContext: Running Spark version 3.1.1
21/03/22 22:24:08 INFO ResourceUtils: ==============================================================
21/03/22 22:24:08 INFO ResourceUtils: No custom resources configured for spark.driver.
21/03/22 22:24:08 INFO ResourceUtils: ==============================================================
21/03/22 22:24:08 INFO SparkContext: Submitted application: testing
21/03/22 22:24:08 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 2, script: , vendor: , memory -> name: memory, amount: 4096, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
21/03/22 22:24:08 INFO ResourceProfile: Limiting resource is cpus at 2 tasks per executor
21/03/22 22:24:08 INFO ResourceProfileManager: Added ResourceProfile id: 0
21/03/22 22:24:08 INFO SecurityManager: Changing view acls to: root
21/03/22 22:24:08 INFO SecurityManager: Changing modify acls to: root
21/03/22 22:24:08 INFO SecurityManager: Changing view acls groups to:
21/03/22 22:24:08 INFO SecurityManager: Changing modify acls groups to:
21/03/22 22:24:08 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
21/03/22 22:24:09 INFO Utils: Successfully started service 'sparkDriver' on port 46787.
21/03/22 22:24:09 INFO SparkEnv: Registering MapOutputTracker
21/03/22 22:24:09 INFO SparkEnv: Registering BlockManagerMaster
21/03/22 22:24:09 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
21/03/22 22:24:09 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
21/03/22 22:24:09 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
21/03/22 22:24:09 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-e71f1d26-8abd-407c-81f5-02ae2bc8008e
21/03/22 22:24:09 INFO MemoryStore: MemoryStore started with capacity 366.3 MiB
21/03/22 22:24:09 INFO SparkEnv: Registering OutputCommitCoordinator
21/03/22 22:24:09 INFO Utils: Successfully started service 'SparkUI' on port 4040.
21/03/22 22:24:09 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://spark-submit:4040
21/03/22 22:24:10 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://spark-master:7077...
21/03/22 22:24:10 INFO TransportClientFactory: Successfully created connection to spark-master/172.19.0.2:7077 after 43 ms (0 ms spent in bootstraps)
21/03/22 22:24:10 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20210322222410-0000
21/03/22 22:24:10 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43373.
21/03/22 22:24:10 INFO NettyBlockTransferService: Server created on spark-submit:43373
21/03/22 22:24:10 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
21/03/22 22:24:10 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, spark-submit, 43373, None)
21/03/22 22:24:10 INFO BlockManagerMasterEndpoint: Registering block manager spark-submit:43373 with 366.3 MiB RAM, BlockManagerId(driver, spark-submit, 43373, None)
21/03/22 22:24:10 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, spark-submit, 43373, None)
21/03/22 22:24:10 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, spark-submit, 43373, None)
21/03/22 22:24:10 INFO StandaloneAppClient$ClientEndpoint: Executor added: app-20210322222410-0000/0 on worker-20210322222407-172.19.0.4-37429 (172.19.0.4:37429) with 2 core(s)
21/03/22 22:24:10 INFO StandaloneSchedulerBackend: Granted executor ID app-20210322222410-0000/0 on hostPort 172.19.0.4:37429 with 2 core(s), 4.0 GiB RAM
21/03/22 22:24:10 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20210322222410-0000/0 is now RUNNING
21/03/22 22:24:10 INFO StandaloneSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
21/03/22 22:24:11 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/spark-warehouse').
21/03/22 22:24:11 INFO SharedState: Warehouse path is 'file:/spark-warehouse'.
[1, 4, 9, 16]
And the results are correct, [1, 4, 9, 16], since x * x is applied in the map function.
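As a sanity check of that expected output, the same transformation can be reproduced in plain Python (no Spark needed) — map with x * x is just element-wise squaring:

```python
# Plain-Python equivalent of nums.map(lambda x: x * x).collect()
nums = [1, 2, 3, 4]
squares = [x * x for x in nums]
print(squares)  # [1, 4, 9, 16]
```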
I used the latest version of the Spark Docker images, bde2020/spark-python-template:3.1.1-hadoop3.2, to bundle the spark-submit example on Docker.
Feel free to comment in case you are still facing the same issue.
Best regards,
Hey @honghaolin ,
I'm closing this one for now, but feel free to post again if you are still facing any issue with this.
Best regards,