Configuring HDFS Master timeout
13k75 opened this issue · 2 comments
13k75 commented
- Flintrock version: 0.11.0
- Python version: 3.7.4
- OS: Linux
Hi Nicholas,
When starting my cluster, the HDFS configuration times out. Unlike in the previous issue about m5.large instances, though, the Hadoop logs don't show anything amiss: the NameNode and SecondaryNameNode are starting and stopping normally.
Here is my config file:
```yaml
services:
  spark:
    version: 2.4.4
  hdfs:
    version: 3.1.2

provider: ec2

providers:
  ec2:
    key-name: spark_cluster
    identity-file: /home/kasra/distributed-setup/spark_cluster.pem
    instance-type: t2.micro
    region: us-west-2
    ami: ami-04b762b4289fba92b  # amazon linux 2
    user: ec2-user
    tenancy: default  # default | dedicated
    ebs-optimized: no  # yes | no
    instance-initiated-shutdown-behavior: terminate  # terminate | stop

launch:
  num-slaves: 1
  install-hdfs: True
  install-spark: True

debug: true
```
And I'm happy to provide the Hadoop logs too if you want them, though as I said, they don't show any errors or warnings.
I would appreciate any help or insight you might have. Thanks!
nchammas commented
I don't know that Spark will work out of the box with Hadoop 3+. I would stick to Flintrock's default of Hadoop 2.8.5 and see if you still have any issues.
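For reference, that suggestion amounts to a one-line change in the config file above (a sketch; the rest of the file stays the same):

```yaml
services:
  spark:
    version: 2.4.4
  hdfs:
    version: 2.8.5  # Flintrock's default Hadoop line at the time
```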
13k75 commented
Yes, that's exactly it! Hadoop 3 changed a number of default ports. In particular, the NameNode web UI moved from 50070 to 9870.
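For anyone hitting the same timeout: the Hadoop 3 port renumbering (done in HDFS-9427) covers several daemons, not just the NameNode UI. A small sketch of the web UI port changes, useful if a tool you rely on still probes the Hadoop 2 defaults (the helper name here is illustrative, not part of Flintrock or Hadoop):

```python
# Hadoop 2.x default web UI ports and their Hadoop 3.x replacements
# (per the HDFS-9427 port renumbering).
HADOOP3_PORT_CHANGES = {
    50070: 9870,  # NameNode HTTP UI
    50090: 9868,  # SecondaryNameNode HTTP UI
    50075: 9864,  # DataNode HTTP UI
}

def hadoop3_port(port: int) -> int:
    """Return the Hadoop 3 equivalent of a Hadoop 2 default port.

    Ports that did not change are returned unchanged.
    """
    return HADOOP3_PORT_CHANGES.get(port, port)

print(hadoop3_port(50070))  # -> 9870
```

So a health check that waits on `http://<master>:50070` will time out against a Hadoop 3 cluster even though HDFS itself came up cleanly, which matches the clean logs seen here.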