nchammas/flintrock

Configuring HDFS Master timeout

13k75 opened this issue · 2 comments

13k75 commented
  • Flintrock version: 0.11.0
  • Python version: 3.7.4
  • OS: Linux

Hi Nicholas,

When starting my cluster, the HDFS configuration step times out. Unlike in the previous issue about m5.large instances, though, the Hadoop logs don't show anything amiss: the NameNode and SecondaryNameNode start and stop normally.

Here is my config file:

services:
  spark:
    version: 2.4.4

  hdfs:
    version: 3.1.2

provider: ec2

providers:
  ec2:
    key-name: spark_cluster
    identity-file: /home/kasra/distributed-setup/spark_cluster.pem
    instance-type: t2.micro
    region: us-west-2
    ami: ami-04b762b4289fba92b # amazon linux 2
    user: ec2-user
    tenancy: default  # default | dedicated
    ebs-optimized: no  # yes | no
    instance-initiated-shutdown-behavior: terminate  # terminate | stop

launch:
  num-slaves: 1
  install-hdfs: True
  install-spark: True

debug: true

I'm happy to provide the Hadoop logs as well if you want them, though as I said they don't show any errors or warnings.

I would appreciate any help or insight you might have. Thanks!

nchammas commented

I don't know that Spark will work out of the box with Hadoop 3+. I would stick to Flintrock's default of Hadoop 2.8.5 and see if you still have any issues.
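
For reference, a minimal sketch of that change, assuming the rest of the config stays exactly as posted above: only the hdfs entry in the services section needs to be updated.

services:
  spark:
    version: 2.4.4

  hdfs:
    version: 2.8.5  # Flintrock's default, per the suggestion above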

13k75 commented

Yes, that's exactly it! Hadoop 3+ changes a number of default ports; in particular, the NameNode web UI moved from 50070 to 9870.