words-sdsc/coursera

Error in Course VM, when trying to run PySpark

Closed · 1 comment

EMCP commented
$ pyspark
jupyter: '/home/cloudera/anaconda3/bin/find_spark_home.py` is not a Jupyter command
'/home/cloudera/anaconda3/bin/pyspark: line 24: /bin/load-spark-env.sh: no such file or directory
'/home/cloudera/anaconda3/bin/pyspark: line 77: /bin/spark-submit: No such file or dir 
'/home/cloudera/anaconda3/bin/pyspark: line 77: exec: /bin/spark-submit: cannot execute: No such ...

This happened while working through the Coursera course Machine Learning with Big Data.

I am working around this by setting up a separate environment, but I thought I should warn you: the VM as shipped doesn't work for PySpark. Also, using such an old VM is a security risk. I recommend either adding a $ sudo yum update step to the instructions or recommending a newer VM image.
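For anyone hitting the same errors, here is a rough diagnostic sketch, not a confirmed fix. It rests on two assumptions that the log suggests but does not prove: that the launcher builds its paths from SPARK_HOME (an empty value would explain paths like /bin/spark-submit), and that PYSPARK_DRIVER_PYTHON is set to jupyter on the VM (which would explain the "is not a Jupyter command" message). The function name diagnose_pyspark_env is made up for illustration.

```python
import os


def diagnose_pyspark_env(env=None):
    """Return hints about environment variables the pyspark launcher reads.

    Hypothetical helper: it only inspects variables, it changes nothing.
    """
    env = os.environ if env is None else env
    hints = []

    # Assumption: the launcher resolves "$SPARK_HOME/bin/...", so an empty
    # SPARK_HOME would turn that into "/bin/...".
    spark_home = env.get("SPARK_HOME", "")
    if not spark_home:
        hints.append("SPARK_HOME is unset or empty")
    elif not os.path.isfile(os.path.join(spark_home, "bin", "spark-submit")):
        hints.append("SPARK_HOME=%s has no bin/spark-submit" % spark_home)

    # Assumption: a jupyter driver setting makes helper scripts run through
    # jupyter instead of a plain Python interpreter.
    driver = env.get("PYSPARK_DRIVER_PYTHON", "")
    if os.path.basename(driver).startswith("jupyter"):
        hints.append("PYSPARK_DRIVER_PYTHON points at jupyter: %s" % driver)

    return hints


if __name__ == "__main__":
    for hint in diagnose_pyspark_env() or ["No obvious problem found"]:
        print(hint)
```

If either hint shows up, temporarily unsetting PYSPARK_DRIVER_PYTHON or pointing SPARK_HOME at a real Spark installation before running pyspark is worth trying, though I have not verified this on the course VM.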