linkedin/dynamometer

Miss ALL blocks

Closed this issue · 6 comments

To @xkrogen,
Good afternoon! The NameNode reports that all blocks are missing and no DataNode is registered after a Manual Workload Launch. These commands were used:
1. Execute the Block Generation Job:
./generate-block-lists.sh -fsimage_input_path hdfs://cluster/user/qa/dyno/fsimage/fsimage_0000000000282000135.xml -block_image_output_dir hdfs://cluster/user/qa/dyno/blocks -num_reducers 1 -num_datanodes 1

2. Manual Workload Launch:
./start-dynamometer-cluster.sh --hadoop_binary_path hadoop-2.7.3-1.2.7.tar.gz --conf_path /home/hdfs/Dynamometer/dynamometer-0.1.0-SNAPSHOT/bin/hadoop --fs_image_dir hdfs://cluster/user/qa/dyno/fsimage --block_list_path hdfs://cluster/user/qa/dyno/blocks

The NameNode UI showed: "There are 100 missing blocks. The following files may be corrupted."

Hi @seanshaogh, thanks for reporting this issue! The steps you've used to launch it seem correct. Can you provide some more information about what happened:

  • Is there anything interesting/suspicious in the logs of the NameNode container?
  • Was the DataNode container launched at all? The AM logs should have links to the container and its logs. If so, was the DataNode process itself successfully launched within the container? Hopefully the DN logs will have some indication of what went wrong.

Hi @xkrogen, thanks for your reply! I tried to launch one DataNode in a container and found that no DataNode registered with the NameNode. The logs showed that the DataNode process itself launched successfully within the container, but it looks like the DataNode could not connect to the NameNode. The logs are shown below:

The DataNode logs:

Starting datanode with ID 000003
PWD is: /mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003
Saving original HADOOP_HOME as: /usr/ndp/current/yarn_nodemanager
Saving original HADOOP_CONF_DIR as: /usr/ndp/current/yarn_nodemanager/conf
Environment variables are set as:
(note that this doesn't include changes made by hadoop-env.sh)
XDG_SESSION_ID=c797411
YARN_RESOURCEMANAGER_OPTS= -Drm.audit.logger=INFO,RMAUDIT -Drm.audit.logger=INFO,RMAUDIT
HADOOP_LOG_DIR=/mnt/dfs/0/hadoop/yarn/log/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003
HADOOP_IDENT_STRING=yarn
SHELL=/bin/bash
HADOOP_HOME=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/hadoopBinary/hadoop-2.7.3-1.2.7
NM_HOST=hadoop
YARN_PID_DIR=/var/run/ndp/hadoop-yarn/yarn
HADOOP_PID_DIR=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node/pid
NN_EDITS_DIR=
HADOOP_PREFIX=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/hadoopBinary/hadoop-2.7.3-1.2.7
YARN_NICENESS=0
NM_AUX_SERVICE_mapreduce_shuffle=AAA0+gAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=

QTDIR=/usr/lib64/qt-3.3
NN_ADDITIONAL_ARGS=
NM_HTTP_PORT=8042
QTINC=/usr/lib64/qt-3.3/include
QT_GRAPHICSSYSTEM_CHECKED=1
LOCAL_DIRS=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446
USER=qa
JAVA_LIBRARY_PATH=/mnt/dfs/0/ndp/3.3.0/yarn_nodemanager/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/mnt/dfs/0/ndp/3.3.0/yarn_nodemanager/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
HADOOP_HEAPSIZE=
HADOOP_TOKEN_FILE_LOCATION=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/container_tokens
HADOOP_LIBEXEC_DIR=/usr/ndp/current/yarn_nodemanager/libexec
LOG_DIRS=/mnt/dfs/0/hadoop/yarn/log/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003
MALLOC_ARENA_MAX=4
YARN_NODEMANAGER_OPTS= -Dnm.audit.logger=INFO,NMAUDIT -Dnm.audit.logger=INFO,NMAUDIT
YARN_ROOT_LOGGER=INFO,EWMA,RFA
PATH=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/hadoopBinary/hadoop-2.7.3-1.2.7/bin:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/var/lib/ambari-agent
HADOOP_HDFS_HOME=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/hadoopBinary/hadoop-2.7.3-1.2.7
YARN_IDENT_STRING=yarn
HADOOP_COMMON_HOME=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/hadoopBinary/hadoop-2.7.3-1.2.7
PWD=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003
JAVA_HOME=/usr/jdk64/jdk1.8.0_152
NN_NAME_DIR=
HADOOP_YARN_HOME=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/hadoopBinary/hadoop-2.7.3-1.2.7
HADOOP_CLASSPATH=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/additionalClasspath/
LANG=en_US.UTF-8
HADOOP_CONF_DIR=/etc/hdfs/hdfs_namenode/2.7.3/0
HADOOP_OPTS=-Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/ndp/hadoop-hdfs/hdfs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/mnt/dfs/0/ndp/3.3.0/yarn_nodemanager -Dhadoop.id.str=yarn -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/mnt/dfs/0/ndp/3.3.0/yarn_nodemanager/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/var/log/ndp/hadoop-hdfs/hdfs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/mnt/dfs/0/ndp/3.3.0/yarn_nodemanager -Dhadoop.id.str=yarn -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/mnt/dfs/0/ndp/3.3.0/yarn_nodemanager/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/mnt/dfs/0/ndp/3.3.0/yarn_nodemanager/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true
YARN_TIMELINESERVER_HEAPSIZE=1024
YARN_LOG_DIR=/var/log/ndp/hadoop-yarn/yarn_nodemanager
LIBHDFS_OPTS=-Djava.library.path=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/hadoopBinary/hadoop-2.7.3-1.2.7/lib/native
HOME=/home/
SHLVL=4
DN_ADDITIONAL_ARGS=
YARN_LOGFILE=yarn-yarn-nodemanager-hadoop.log
YARN_CONF_DIR=/etc/mapreduce2/mapreduce_client/2.7.3/0
JVM_PID=8092
YARN_NODEMANAGER_HEAPSIZE=4096
HADOOP_MAPRED_HOME=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/hadoopBinary/hadoop-2.7.3-1.2.7
HADOOP_SSH_OPTS=-o ConnectTimeout=5 -o SendEnv=HADOOP_CONF_DIR
NM_PORT=45454
LOGNAME=qa
QTLIB=/usr/lib64/qt-3.3/lib
NM_AUX_SERVICE_spark_shuffle=
HADOOP_HOME_WARN_SUPPRESS=1
CONTAINER_ID=container_e105_1545030638014_43446_01_000003
LESSOPEN=||/usr/bin/lesspipe.sh %s
NN_FILE_METRIC_PERIOD=60
HADOOP_ROOT_LOGGER=INFO,RFA
XDG_RUNTIME_DIR=/run/user/5012
YARN_RESOURCEMANAGER_HEAPSIZE=6144
HADOOP_YARN_USER=yarn
_=/usr/bin/printenv

Going to sleep for 0 sec...
Executing the following:
/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/hadoopBinary/hadoop-2.7.3-1.2.7/bin/hadoop jar dynamometer.jar com.linkedin.dynamometer.SimulatedDataNodes -D fs.defaultFS=hdfs://hadoop1:9022/
-D dfs.datanode.hostname=hadoop
-D dfs.datanode.data.dir=file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//hadoop/hdfs/data,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/0,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/1,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/2,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/3,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/4,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/5,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/6,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/7,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/8,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/9,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/10,file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node//mnt/dfs/11
-D dfs.datanode.ipc.address=0.0.0.0:0
-D dfs.datanode.http.address=0.0.0.0:0
-D dfs.datanode.address=0.0.0.0:0
-D dfs.datanode.directoryscan.interval=-1
-D fs.du.interval=43200000
-D fs.getspaceused.jitterMillis=21600000
-D hadoop.tmp.dir=/mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/dyno-node
-D hadoop.security.authentication=simple
-D hadoop.security.authorization=false
-D dfs.http.policy=HTTP_ONLY
-D dfs.nameservices=
-D dfs.web.authentication.kerberos.principal=
-D dfs.web.authentication.kerberos.keytab=
-D hadoop.http.filter.initializers=
-D dfs.datanode.kerberos.principal=
-D dfs.datanode.keytab.file=
-D dfs.domain.socket.path=
-D dfs.client.read.shortcircuit=false
BP-555526057-yarn-1534758010800
file:///mnt/dfs/0/hadoop/yarn/local/usercache/qa/appcache/application_1545030638014_43446/container_e105_1545030638014_43446_01_000003/blocks/block0
Started datanode at pid 8219
Waiting for parent process (PID: 8092) OR datanode process to exit
DataNodes will connect to NameNode at hadoop1:9022
Found 1 block listing files; launching DataNodes accordingly.
Waiting for DataNodes to connect to NameNode and init storage directories.

The NameNode logs:

2018-12-18 14:38:09,654 [0] - INFO [main:ApplicationMaster@164] - Initializing ApplicationMaster
2018-12-18 14:38:09,981 [327] - INFO [main:ApplicationMaster@229] - Application master for app, appId=43446, clustertimestamp=1545030638014, attemptId=1
2018-12-18 14:38:09,981 [327] - INFO [main:ApplicationMaster@258] - Starting ApplicationMaster
2018-12-18 14:38:10,103 [449] - WARN [main:NativeCodeLoader@62] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-12-18 14:38:10,304 [650] - INFO [main:NMClientAsyncImpl@107] - Upper bound of the thread pool size is 500
2018-12-18 14:38:10,306 [652] - INFO [main:ContainerManagementProtocolProxy@81] - yarn.client.max-cached-nodemanagers-proxies : 0
2018-12-18 14:38:10,510 [856] - INFO [main:ApplicationMaster@300] - Requested NameNode ask: Capability[<memory:2048, vCores:1>]Priority[0]
2018-12-18 14:38:10,518 [864] - INFO [main:ApplicationMaster@306] - Waiting on availability of NameNode information at hdfs://cluster/user/mammut_qa/.dynamometer/application_1545030638014_43446/nn_info.prop
2018-12-18 14:38:11,167 [1513] - WARN [main:DomainSocketFactory@117] - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2018-12-18 14:38:12,548 [2894] - INFO [AMRM Heartbeater thread:AMRMClientImpl@360] - Received new token for : hadoop1:45454
2018-12-18 14:38:12,551 [2897] - INFO [AMRM Callback Handler Thread:ApplicationMaster$RMCallbackHandler@483] - Got response from RM for container ask, allocatedCnt=1
2018-12-18 14:38:12,553 [2899] - INFO [AMRM Callback Handler Thread:ApplicationMaster$RMCallbackHandler@511] - Launching NAMENODE on a new container., containerId=container_e105_1545030638014_43446_01_000002, containerNode=hadoop1:45454, containerNodeURI=hadoop1:8042, containerResourceMemory=10240, containerResourceVirtualCores=1
2018-12-18 14:38:12,554 [2900] - INFO [Thread-7:ApplicationMaster$LaunchContainerRunnable@655] - Setting up container launch context for containerid=container_e105_1545030638014_43446_01_000002, isNameNode=true
2018-12-18 14:38:12,620 [2966] - INFO [Thread-7:ApplicationMaster$LaunchContainerRunnable@732] - Completed setting up command for namenode: [./start-component.sh, namenode, hdfs://cluster/user/mammut_qa/.dynamometer/application_1545030638014_43446, 1><LOG_DIR>/stdout, 2><LOG_DIR>/stderr]
2018-12-18 14:38:12,633 [2979] - INFO [Thread-7:ApplicationMaster$LaunchContainerRunnable@676] - Starting NAMENODE; track at: http://hadoop1:8042/node/containerlogs/container_e105_1545030638014_43446_01_000002/mammut_qa/
2018-12-18 14:38:12,635 [2981] - INFO [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #0:NMClientAsyncImpl$ContainerEventProcessor@531] - Processing Event EventType: START_CONTAINER for Container container_e105_1545030638014_43446_01_000002
2018-12-18 14:38:12,638 [2984] - INFO [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #0:ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData@260] - Opening proxy : hadoop1:45454
2018-12-18 14:38:12,709 [3055] - INFO [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #0:ApplicationMaster$NMCallbackHandler@578] - NameNode container started at ID container_e105_1545030638014_43446_01_000002
2018-12-18 14:38:21,357 [11703] - INFO [main:ApplicationMaster@314] - NameNode information: {NM_HTTP_PORT=8042, NN_HOSTNAME=hadoop1, NN_HTTP_PORT=50077, NN_SERVICERPC_PORT=9022, NN_RPC_PORT=9002, CONTAINER_ID=container_e105_1545030638014_43446_01_000002}
2018-12-18 14:38:21,358 [11704] - INFO [main:ApplicationMaster@315] - NameNode can be reached at: hdfs://hadoop1:9002/
2018-12-18 14:38:21,358 [11704] - INFO [main:DynoInfraUtils@196] - Waiting for NameNode to finish starting up...
2018-12-18 14:40:09,981 [120327] - INFO [main:DynoInfraUtils@355] - Startup progress = 1.00; above threshold of 1.00; done waiting after 108621 ms.
2018-12-18 14:40:09,982 [120328] - INFO [main:DynoInfraUtils@199] - NameNode has started!
2018-12-18 14:40:09,982 [120328] - INFO [main:ApplicationMaster@760] - Looking for block listing files in hdfs://cluster/user/mammut_qa/dyno/blocksone
2018-12-18 14:40:10,002 [120348] - INFO [main:ApplicationMaster@331] - Requesting 1 DataNode containers with 2048MB memory, 1 vcores,
2018-12-18 14:40:10,002 [120348] - INFO [main:ApplicationMaster@340] - Finished requesting datanode containers
2018-12-18 14:40:10,002 [120348] - INFO [main:DynoInfraUtils@219] - Waiting for 0 DataNodes to register with the NameNode...
2018-12-18 14:40:10,012 [120358] - INFO [main:DynoInfraUtils@355] - Number of live DataNodes = 0.00; above threshold of 0.00; done waiting after 9 ms.
2018-12-18 14:40:10,028 [120374] - INFO [main:DynoInfraUtils@237] - Launching thread to trigger block reports for Datanodes with <38774742 blocks reported
2018-12-18 14:40:10,029 [120375] - INFO [main:DynoInfraUtils@299] - Waiting for MissingBlocks to fall below 1938.7372...
2018-12-18 14:40:10,031 [120377] - INFO [main:DynoInfraUtils@359] - Number of missing blocks: 6527.00
2018-12-18 14:40:11,702 [122048] - INFO [AMRM Heartbeater thread:AMRMClientImpl@360] - Received new token for : hadoop:45454
2018-12-18 14:40:11,702 [122048] - INFO [AMRM Callback Handler Thread:ApplicationMaster$RMCallbackHandler@483] - Got response from RM for container ask, allocatedCnt=1
2018-12-18 14:40:11,703 [122049] - INFO [AMRM Callback Handler Thread:ApplicationMaster$RMCallbackHandler@511] - Launching DATANODE on a new container., containerId=container_e105_1545030638014_43446_01_000003, containerNode=hadoop:45454, containerNodeURI=hadoop:8042, containerResourceMemory=10240, containerResourceVirtualCores=1
2018-12-18 14:40:11,703 [122049] - INFO [Thread-12:ApplicationMaster$LaunchContainerRunnable@655] - Setting up container launch context for containerid=container_e105_1545030638014_43446_01_000003, isNameNode=false
2018-12-18 14:40:11,744 [122090] - INFO [Thread-12:ApplicationMaster$LaunchContainerRunnable@732] - Completed setting up command for datanode: [./start-component.sh, datanode, hdfs://hadoop1:9022/, 0, 1><LOG_DIR>/stdout, 2><LOG_DIR>/stderr]
2018-12-18 14:40:11,744 [122090] - INFO [Thread-12:ApplicationMaster$LaunchContainerRunnable@676] - Starting DATANODE; track at: http://hadoop:8042/node/containerlogs/container_e105_1545030638014_43446_01_000003/mammut_qa/
2018-12-18 14:40:11,745 [122091] - INFO [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #1:NMClientAsyncImpl$ContainerEventProcessor@531] - Processing Event EventType: START_CONTAINER for Container container_e105_1545030638014_43446_01_000003
2018-12-18 14:40:11,753 [122099] - INFO [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #1:ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData@260] - Opening proxy : hadoop:45454
2018-12-18 14:40:11,764 [122110] - INFO [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #2:NMClientAsyncImpl$ContainerEventProcessor@531] - Processing Event EventType: QUERY_CONTAINER for Container container_e105_1545030638014_43446_01_000003
2018-12-18 14:40:11,765 [122111] - INFO [org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl #2:ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData@260] - Opening proxy : hadoop:45454
2018-12-18 14:40:16,036 [126382] - INFO [main:DynoInfraUtils@359] - Number of missing blocks: 3730000.00
2018-12-18 14:40:28,045 [138391] - INFO [main:DynoInfraUtils@359] - Number of missing blocks: 12705849.00
2018-12-18 14:40:40,055 [150401] - INFO [main:DynoInfraUtils@359] - Number of missing blocks: 19387365.00
2018-12-18 14:42:10,081 [240427] - INFO [Thread-11:DynoInfraUtils$1@264] - Queueing 0 Datanodes for block report:
2018-12-18 14:43:10,179 [300525] - INFO [Thread-11:DynoInfraUtils$1@264] - Queueing 0 Datanodes for block report:
2018-12-18 14:44:10,209 [360555] - INFO [Thread-11:DynoInfraUtils$1@264] - Queueing 0 Datanodes for block report:
2018-12-18 14:45:10,246 [420592] - INFO [Thread-11:DynoInfraUtils$1@264] - Queueing 0 Datanodes for block report:
2018-12-18 14:46:10,286 [480632] - INFO [Thread-11:DynoInfraUtils$1@264] - Queueing 0 Datanodes for block report:

Thank you for sharing that! However, the section you have labeled "NameNode logs" is actually the logs of the ApplicationMaster, not the NameNode -- you can find the NameNode logs by following the link in the log line that starts with "Starting NAMENODE; track at ...".

One thing I noticed is that you have a lot of blocks:

2018-12-18 14:40:10,002 [120348] - INFO [main:ApplicationMaster@340] - Finished requesting datanode containers
2018-12-18 14:40:10,002 [120348] - INFO [main:DynoInfraUtils@219] - Waiting for 0 DataNodes to register with the NameNode...
2018-12-18 14:40:10,012 [120358] - INFO [main:DynoInfraUtils@355] - Number of live DataNodes = 0.00; above threshold of 0.00; done waiting after 9 ms.
2018-12-18 14:40:10,028 [120374] - INFO [main:DynoInfraUtils@237] - Launching thread to trigger block reports for Datanodes with <38774742 blocks reported
2018-12-18 14:40:10,029 [120375] - INFO [main:DynoInfraUtils@299] - Waiting for MissingBlocks to fall below 1938.7372...
2018-12-18 14:40:10,031 [120377] - INFO [main:DynoInfraUtils@359] - Number of missing blocks: 6527.00

This seems to indicate that you have nearly 40M blocks in the system, which sounds like too many for a single DataNode. Can I suggest that you increase the number of DataNodes you launch, and increase their total memory allocation?
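For example, something along the following lines, reusing your original commands (note: the --datanode_memory_mb flag name here is my assumption, so please double-check it against the usage output of start-dynamometer-cluster.sh):

# Regenerate the block lists, spreading the blocks across more simulated DataNodes
# (the AM launches one DataNode container per block listing file, per "Found 1 block listing files" above):
./generate-block-lists.sh -fsimage_input_path hdfs://cluster/user/qa/dyno/fsimage/fsimage_0000000000282000135.xml -block_image_output_dir hdfs://cluster/user/qa/dyno/blocks -num_reducers 20 -num_datanodes 20

# Relaunch against the regenerated block lists, giving each DataNode container more memory (flag name assumed):
./start-dynamometer-cluster.sh --hadoop_binary_path hadoop-2.7.3-1.2.7.tar.gz --conf_path /home/hdfs/Dynamometer/dynamometer-0.1.0-SNAPSHOT/bin/hadoop --fs_image_dir hdfs://cluster/user/qa/dyno/fsimage --block_list_path hdfs://cluster/user/qa/dyno/blocks --datanode_memory_mb 8192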

You may also want to adjust the values of the configs dyno.infra.ready.datanode-min-fraction (default 0.99) and dyno.infra.ready.missing-blocks-max-fraction (default 0.001). To make sure all DataNodes report and that there are no missing blocks, you can set these to 1.0 and 0.0, respectively -- by default they allow for a little bit of leeway.
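For instance, one possible way to pass them (I am assuming the launch script forwards generic Hadoop -D options to the client; if it does not, set the same properties in the client-side Hadoop configuration instead):

# Require every DataNode to register and zero missing blocks before the workload starts (option forwarding assumed):
./start-dynamometer-cluster.sh -D dyno.infra.ready.datanode-min-fraction=1.0 -D dyno.infra.ready.missing-blocks-max-fraction=0.0 --hadoop_binary_path hadoop-2.7.3-1.2.7.tar.gz --conf_path /home/hdfs/Dynamometer/dynamometer-0.1.0-SNAPSHOT/bin/hadoop --fs_image_dir hdfs://cluster/user/qa/dyno/fsimage --block_list_path hdfs://cluster/user/qa/dyno/blocks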

@xkrogen, sorry for taking so long to reply. This issue has been solved: the fsimage file did not match the running Hadoop cluster. Thanks for your help!
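For anyone else who hits this: the fsimage XML passed to generate-block-lists.sh needs to come from the same cluster/version you are simulating. One way to regenerate a matching XML (the file names below are just illustrative) is Hadoop's offline image viewer:

# Convert a binary fsimage checkpoint from the target cluster into the XML that generate-block-lists.sh consumes:
hdfs oiv -p XML -i fsimage_0000000000282000135 -o fsimage_0000000000282000135.xml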

Great to hear, @seanshaogh!