
Spark connection issue on EC2

dorienh opened this issue · 5 comments

I've installed Hadoop and Spark with flintrock on EC2. MapReduce works fine, but when I try to run Spark, both with spark-submit and with Zeppelin, I get a connection error.

Below is the output from Zeppelin.

Did I mess up the IP addresses, or do I need to open a TCP port?

Py4JJavaError: An error occurred while calling o120.partitions.
: Call From ip-172-31-19-18.ec2.internal/ to ip-172-31-19-18.ec2.internal:9000 failed on connection exception: Connection refused;

flintrock, version 2.0.0
version: 3.1.2
download-source: ""
version: 3.2.0
download-source: ""
OS: ami-0b5eea76982371e91 # Amazon Linux 2 5.10

Let's first make sure your cluster is in working order.

Does spark-shell or pyspark work if you SSH directly into the master?
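The error above says the connection to port 9000 on the master was refused, so one quick way to narrow things down is to check whether anything is listening on that port at all. Here is a minimal sketch in Python, assuming it is run on the master itself. The helper name `port_is_open` is made up for illustration and is not part of Flintrock, Spark, or Hadoop.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration only. A False result covers
    "connection refused", timeouts, and hostname lookup failures alike.
    """
    try:
        # create_connection resolves the hostname and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError, socket.timeout, and socket.gaierror
        # are all subclasses of OSError.
        return False
```

For example, `port_is_open("ip-172-31-19-18.ec2.internal", 9000)` uses the host and port from the error message. If that returns False on the master, the service expected on that port is probably not running or is bound to a different address, which would point to the cluster setup rather than to a blocked TCP port.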

Magically, it works after logging out and back in. I don't know if it helped, but I did run:

$HADOOP_PREFIX/sbin/ start resourcemanager

Running a Spark shell from the master shouldn't require that.

In any case, are you all set then?

Glad you found it useful.