Connect spark-notebook to spark cluster
Closed this issue · 3 comments
Hi,
I'm trying to connect spark-notebook to the Spark cluster. By default it runs the notebooks in a local Spark (the notebook jobs never appear on the Spark master page), and when I try to connect it to the cluster created by the docker-compose file, the kernel dies.
Following spark-notebook's documentation on this, I'm adding the following to the notebook's metadata:
"customSparkConf": {
"spark.app.name": "Notebook",
"spark.master": "spark://spark-master:7077",
"spark.executor.memory": "1G"
},
Is there anything else I need to do/add?
Hi @Miguel-Alonso!
I've looked into the issue. The problem was a mismatch between the Java versions used by spark-notebook and Spark. I migrated all the images to Java 8. You can find the new docker-compose file in the root of the repo:
https://github.com/big-data-europe/docker-hadoop-spark-workbench/blob/master/docker-compose-java8.yml
If it does not fix the issue for you, feel free to reopen. :-)
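For anyone following along, switching to the Java 8 compose file might look like this (a sketch, assuming the repo is cloned and docker-compose is installed; the `-f` flag selects a non-default compose file):

```shell
# Clone the workbench repo (skip if already checked out)
git clone https://github.com/big-data-europe/docker-hadoop-spark-workbench.git
cd docker-hadoop-spark-workbench

# Bring the stack up using the Java 8 compose file instead of the default one
docker-compose -f docker-compose-java8.yml up -d
```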
Hi @earthquakesan, thanks for that!
One last (small) detail: the docker-compose file references a hadoop-hive.env file that is missing from the repo. I'm just using the normal hadoop.env instead and, apart from a PostgreSQL error, everything else seems to be running OK.
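The workaround described above amounts to pointing the affected service's `env_file` at the existing file. A hedged sketch of what that override might look like in the compose file (the service name here is illustrative, not taken from the repo):

```yaml
# Hypothetical fragment: use the existing hadoop.env in place of the
# missing hadoop-hive.env for a Hive-related service
services:
  hive-metastore:
    env_file:
      - ./hadoop.env
```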
Thanks!
Hi @Miguel-Alonso!
Oops, forgot to push. It's there now. :-)