dharmeshkakadia/presto-hdinsight

Installation Script Action Failure


I've tested this on both HDInsight 3.5 and 3.6. The execution fails on both head nodes with this error:

slider package --install --name presto1 --package build/presto-yarn-package.zip --replacepkg
2017-05-04 01:54:27,144 [main] INFO  service.AbstractService - Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl failed in state INITED; cause: java.net.BindException: Invalid yarn.resourcemanager.address value:0.0.0.0:8032 - see https://wiki.apache.org/hadoop/UnsetHostnameOrPort
java.net.BindException: Invalid yarn.resourcemanager.address value:0.0.0.0:8032 - see https://wiki.apache.org/hadoop/UnsetHostnameOrPort
	at org.apache.slider.client.SliderYarnClientImpl.serviceInit(SliderYarnClientImpl.java:81)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.slider.client.SliderClient.initHadoopBinding(SliderClient.java:490)
	at org.apache.slider.client.SliderClient.serviceInit(SliderClient.java:318)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:182)
	at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:475)
	at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:403)
	at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:630)
	at org.apache.slider.Slider.main(Slider.java:49)
2017-05-04 01:54:27,148 [main] INFO  service.AbstractService - Service Slider Client failed in state INITED; cause: org.apache.hadoop.service.ServiceStateException: java.net.BindException: Invalid yarn.resourcemanager.address value:0.0.0.0:8032 - see https://wiki.apache.org/hadoop/UnsetHostnameOrPort
org.apache.hadoop.service.ServiceStateException: java.net.BindException: Invalid yarn.resourcemanager.address value:0.0.0.0:8032 - see https://wiki.apache.org/hadoop/UnsetHostnameOrPort
	at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
	at org.apache.slider.client.SliderClient.initHadoopBinding(SliderClient.java:490)
	at org.apache.slider.client.SliderClient.serviceInit(SliderClient.java:318)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:182)
	at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:475)
	at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:403)
	at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:630)
	at org.apache.slider.Slider.main(Slider.java:49)
Caused by: java.net.BindException: Invalid yarn.resourcemanager.address value:0.0.0.0:8032 - see https://wiki.apache.org/hadoop/UnsetHostnameOrPort
	at org.apache.slider.client.SliderYarnClientImpl.serviceInit(SliderYarnClientImpl.java:81)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	... 8 more
Exception: java.net.BindException: Invalid yarn.resourcemanager.address value:0.0.0.0:8032 - see https://wiki.apache.org/hadoop/UnsetHostnameOrPort
2017-05-04 01:54:27,149 [main] ERROR main.ServiceLauncher - Exception: java.net.BindException: Invalid yarn.resourcemanager.address value:0.0.0.0:8032 - see https://wiki.apache.org/hadoop/UnsetHostnameOrPort
org.apache.hadoop.service.ServiceStateException: java.net.BindException: Invalid yarn.resourcemanager.address value:0.0.0.0:8032 - see https://wiki.apache.org/hadoop/UnsetHostnameOrPort
	at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
	at org.apache.slider.client.SliderClient.initHadoopBinding(SliderClient.java:490)
	at org.apache.slider.client.SliderClient.serviceInit(SliderClient.java:318)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:182)
	at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:475)
	at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:403)
	at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:630)
	at org.apache.slider.Slider.main(Slider.java:49)
Caused by: java.net.BindException: Invalid yarn.resourcemanager.address value:0.0.0.0:8032 - see https://wiki.apache.org/hadoop/UnsetHostnameOrPort
	at org.apache.slider.client.SliderYarnClientImpl.serviceInit(SliderYarnClientImpl.java:81)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	... 8 more
2017-05-04 01:54:27,150 [main] INFO  util.ExitUtil - Exiting with status 56
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/custom_actions/scripts/run_customscriptaction.py", line 194, in <module>
    ExecuteScriptAction().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 306, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/custom_actions/scripts/run_customscriptaction.py", line 179, in actionexecute
    ExecuteScriptAction.execute_bash_script(bash_script, scriptpath, scriptparams)
  File "/var/lib/ambari-agent/cache/custom_actions/scripts/run_customscriptaction.py", line 149, in execute_bash_script
    raise Exception("Execution of custom script failed with exit code",exitcode)
Exception: ('Execution of custom script failed with exit code', 56)

As best as I can tell, yarn.resourcemanager.address is not set on HDInsight, so the installation falls back to the default value (0.0.0.0:8032, as shown in the error). Instead, HDInsight sets yarn.resourcemanager.address.rm1 and yarn.resourcemanager.address.rm2 for HA mode.

Would setting yarn.resourcemanager.address to the address of rm1 fix the issue, or does the installation process need a more comprehensive change?
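One way to check the observation above before changing anything is to look at which ResourceManager address keys yarn-site.xml actually defines. The following is only a minimal sketch: the sample XML, its hostnames, and the helper names `load_hadoop_properties` and `describe_rm_addresses` are made up for illustration, and it does not show what the script action itself does.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mimicking an HA yarn-site.xml; hostnames are placeholders.
SAMPLE_YARN_SITE = """<configuration>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.address.rm1</name><value>hn0.example.invalid:8050</value></property>
  <property><name>yarn.resourcemanager.address.rm2</name><value>hn1.example.invalid:8050</value></property>
</configuration>"""


def load_hadoop_properties(xml_text):
    """Return {name: value} parsed from the text of a Hadoop *-site.xml file."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def describe_rm_addresses(props):
    """Report the plain address key and the per-RM (HA) address keys."""
    base = "yarn.resourcemanager.address"
    raw_ids = props.get("yarn.resourcemanager.ha.rm-ids", "")
    rm_ids = [rm.strip() for rm in raw_ids.split(",") if rm.strip()]
    return {
        # None here means a client would fall back to the built-in default.
        "plain": props.get(base),
        "per_rm": {rm: props.get(base + "." + rm) for rm in rm_ids},
    }


if __name__ == "__main__":
    summary = describe_rm_addresses(load_hadoop_properties(SAMPLE_YARN_SITE))
    print("yarn.resourcemanager.address:", summary["plain"])
    for rm_id, address in summary["per_rm"].items():
        print("yarn.resourcemanager.address." + rm_id + ":", address)
```

To run it against a real cluster, read the contents of the node's yarn-site.xml (its location depends on the distribution) and pass that text to `load_hadoop_properties` instead of the sample string.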

I think you are running this on a Spark cluster. Can you try it on a Hadoop cluster?

You were correct. This error came from running the installation action on a Spark cluster.

Hi,

I'm getting the same error, also with Spark on HDInsight. Is it not possible to somehow change the default port during the Presto installation? The README suggests that any customization requires an existing Presto cluster to be running.

Can we maybe add some environment variable before we run the script?

  1. The script is not supported on Spark clusters. You will need to run it on a Hadoop cluster.
  2. You can change the default port by changing site.global.presto_server_port in createconfigs.sh. Does customizing after installation (as described in the README) not work for you? What scenario are you trying to achieve?

  1. Why is this? Spark on HDInsight runs on YARN, not standalone. The only difference I see is the port on which YARN is listening (8050 instead of 8032). Are there other issues when Spark and Presto run side by side?
    I think I misled you with the port question. I was asking whether it's possible to give the script a different port for YARN. When Spark is installed, YARN is on 8050.

  2. I understood from the README that Presto should already be installed on the cluster prior to customization. Since the installation failed, I did not even try.

The limitation on running Presto on a Spark cluster is not due to the port. This Presto installation is managed via Apache Slider. We don't ship Slider on Spark clusters. That's why you need a Hadoop cluster.

Hmm, now I'm a bit confused. If I connect to my cluster over SSH and run slider version, this is what I get:

sshuser@hn0-cigspa:~$ slider version
2017-07-07 05:20:24,119 [main] INFO client.SliderClient - Slider Core-0.92.0.2.6.0.10-29 Built against commit# 790c3934c8 on Java 1.7.0_21 by jenkins
2017-07-07 05:20:24,121 [main] INFO client.SliderClient - Compiled against Hadoop 2.7.3.2.6.0.10-29
2017-07-07 05:20:24,128 [main] INFO client.SliderClient - Hadoop runtime version (HEAD detached at aaf0730) with source checksum 2e8e51d932beeff9cb25224a3a758e2 and build date 2017-05-15T18:52Z

Same thing on one of the nodes.

On a Spark cluster, only the Slider client binary is installed; Slider itself is not configured properly. That is why you see "Invalid yarn.resourcemanager.address value:0.0.0.0:8032".

Spark and Presto running on the same cluster will also compete for resources, since both require significant amounts of memory. I highly recommend using separate Presto and Spark clusters.

I am afraid you will have to configure Presto yourself for that scenario.

@grbinho, if you need a less expensive Presto development environment, another option to consider is running Presto standalone on a VM, which avoids a second cluster and the HDInsight markup.

@grbinho Also, if you want a local Docker-based environment, consider using https://github.com/arsenvlad/docker-presto-adls-wasb, which supports Azure storage.
cc @arsenvlad