## Enable WebHCat on Cloudera Manager
#### Add a WebHCat instance from Cloudera Manager
- In Cloudera Manager, go to Hive -> Instances -> Add Role Instances, select a host for the WebHCat Server role, and click Finish.
- Start the WebHCat server. Once it is running, metadata-related operations should already work (a quick check is shown after the configuration snippet below).
- Additional configuration is needed for Hive, Pig, MapReduce JAR, and MapReduce Streaming jobs.
- Go to Hive -> Configuration -> WebHCat Server Default Group -> Advanced -> WebHCat Server Advanced Configuration Snippet (Safety Valve) for webhcat-site.xml.
- For Hive and Pig, we need to specify the HDFS path of each archive and the path of the executable inside the archive. For example:
```xml
<property>
  <name>templeton.libjars</name>
  <value>/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/zookeeper/zookeeper-3.4.5-cdh5.7.1.jar</value>
  <description>ZooKeeper jars to add to the classpath.</description>
</property>
<property>
  <name>templeton.hive.archive</name>
  <value>hdfs:///apps/webhcat/hive.tar.gz</value>
  <description>The path to the Hive archive file in HDFS.</description>
</property>
<property>
  <name>templeton.hive.path</name>
  <value>hive.tar.gz/bin/hive</value>
  <description>The path to the Hive executable inside the archive.</description>
</property>
<property>
  <name>templeton.pig.archive</name>
  <value>hdfs:///apps/webhcat/pig.tar.gz</value>
  <description>The path to the Pig archive file in HDFS.</description>
</property>
<property>
  <name>templeton.pig.path</name>
  <value>pig.tar.gz/bin/pig</value>
  <description>The path to the Pig executable inside the archive.</description>
</property>
<property>
  <name>templeton.streaming.jar</name>
  <value>hdfs:///apps/webhcat/hadoop-streaming.jar</value>
  <description>The path to the MapReduce streaming jar in HDFS.</description>
</property>
<property>
  <name>templeton.hive.properties</name>
  <value>hive.metastore.local=false,hive.metastore.uris=thrift://metadatahostname:9083,hive.metastore.sasl.enabled=true,hive.metastore.execute.setugi=true,hive.exec.mode.local.auto=false,hive.metastore.kerberos.principal=hive/_HOST@BGDATA.COM</value>
  <description>Hive configuration passed to jobs, mainly the Hive metastore properties.</description>
</property>
```
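After saving the snippet and restarting the WebHCat server, the standard WebHCat REST endpoints give a quick way to confirm that the server is up and can reach the metastore. The hostname below is a placeholder, 50111 is the default WebHCat port, and on a Kerberized cluster (as the principal above suggests) you would authenticate with `--negotiate -u :` instead of the `user.name` parameter:

```bash
# Health check; expects {"status":"ok","version":"v1"}
curl -s 'http://webhcat-host.example.com:50111/templeton/v1/status'

# Metadata operation against the configured metastore: list databases
curl -s 'http://webhcat-host.example.com:50111/templeton/v1/ddl/database?user.name=hdfs'
```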
- Add TEMPLETON_HOME to the WebHCat Server Environment Advanced Configuration Snippet (Safety Valve):
  TEMPLETON_HOME=/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hive-hcatalog
- Prepare and upload the Hive archive to HDFS:
  - Step into the Hive directory and run tar -hzcvf hive.tar.gz ./ (the -h flag dereferences symlinks, which the parcel layout relies on), e.g.:
    cd /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hive && tar -hzcvf hive.tar.gz ./
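Since templeton.hive.path expects bin/hive at that relative path inside the archive, it is worth confirming the layout before uploading. A quick sanity check (the path reuses the CDH 5.7.1 parcel location from above):

```bash
cd /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hive
tar -tzf hive.tar.gz | grep 'bin/hive$'   # should print ./bin/hive
```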
- Prepare and upload the Pig archive to HDFS:
  - In order to talk to HCatalog, we need to add
    HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive HCAT_HOME=/opt/cloudera/parcels/CDH/lib/hcatalog
    to the pig executable file, as sketched below.
  - Step into the Pig directory and run tar -hzcvf pig.tar.gz ./ , e.g.:
    cd /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/pig && tar -hzcvf pig.tar.gz ./
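A minimal sketch of that edit, assuming the parcel's bin/pig is a shell wrapper script (add the exports near the top of the file, before the archive is created):

```bash
# added near the top of bin/pig inside the pig directory
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
export HCAT_HOME=/opt/cloudera/parcels/CDH/lib/hcatalog
```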
- Upload everything to HDFS under /apps/webhcat/ (if the directory does not exist yet, create it first; see the note below):
  - hadoop fs -copyFromLocal hive.tar.gz /apps/webhcat/hive.tar.gz
  - hadoop fs -copyFromLocal pig.tar.gz /apps/webhcat/pig.tar.gz
  - hadoop fs -copyFromLocal hadoop-streaming.jar /apps/webhcat/hadoop-streaming.jar
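Creating the target directory and verifying the uploads, assuming you can act as the hdfs superuser (adjust to your cluster's conventions; in a CDH parcel, hadoop-streaming.jar typically lives under lib/hadoop-mapreduce/):

```bash
sudo -u hdfs hadoop fs -mkdir -p /apps/webhcat
hadoop fs -ls /apps/webhcat   # should list the three files after the uploads
```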
- We might need to change the HTTP user's UID to make sure it is greater than 1000, because YARN's Linux container executor refuses to run jobs for users whose UID is below its min.user.id threshold (1000 by default):
  - usermod -u 10001 HTTP
- Done; restart the WebHCat server.
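After the restart, an end-to-end smoke test is to submit real jobs through the configured archives. A sketch with placeholder host, user, and HDFS paths (again, add `--negotiate -u :` on a Kerberized cluster):

```bash
# Submit a Hive query; stdout/stderr and the exit code land under statusdir
curl -s --data-urlencode 'execute=show databases;' \
  -d statusdir=/tmp/webhcat.hive.out \
  'http://webhcat-host.example.com:50111/templeton/v1/hive?user.name=hdfs'

# Submit a streaming job using the uploaded hadoop-streaming.jar
curl -s -d input=/user/hdfs/input -d output=/user/hdfs/streaming.out \
  -d mapper=/bin/cat -d reducer=/usr/bin/wc \
  'http://webhcat-host.example.com:50111/templeton/v1/mapreduce/streaming?user.name=hdfs'
```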
- You might hit a permission issue when WebHCat tries to write job details into /templeton-hadoop/jobs. Give all users write permission on /templeton-hadoop/jobs, as sketched below.
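A blunt but common fix is to open the directory up for everyone; 1777 also sets the sticky bit so users cannot remove each other's job entries (running as the hdfs superuser is an assumption):

```bash
sudo -u hdfs hadoop fs -chmod -R 1777 /templeton-hadoop/jobs
```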