
Enable Web HCatalog for Cloudera

##Enable WebHCat on Cloudera Manager

####Add a WebHCat instance from Cloudera Manager

  • In Cloudera Manager, click Hive -> Instances -> Add Role Instances -> select a host for WebHCat Server -> Finished.
  • Start the WebHCat server. Once it is running, metadata-related operations should work.
  • Hive, Pig, MapReduce JAR, and MapReduce Streaming jobs need additional configuration:
    • Go to Hive -> Configuration -> WebHCat Server Default Group -> Advanced -> WebHCat Server Advanced Configuration Snippet (Safety Valve) for webhcat-site.xml.
    • Add the configuration for Hive, Pig, MapReduce JAR, and MapReduce Streaming there.
    • For Hive and Pig, specify the HDFS path of the archive and the path of the executable inside that archive. See the example:
        <description>ZooKeeper jars to add to the classpath.</description>
        <description>The path to the Hive archive file in HDFS.</description>
        <description>The path to the Hive executable.</description>
        <description>The path to the Pig archive file in HDFS.</description>
        <description>The path to the Pig executable.</description>
        <description>The path to the MapReduce streaming jar.</description>
        <description>Mainly focuses on the Hive metastore properties.</description>
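The descriptions above match the stock WebHCat `templeton.*` properties. As a rough sketch, the snippet might look like the following, written as a shell heredoc so it can be reviewed before it is pasted into the Safety Valve. The property names are the standard ones; every value is an assumption: the HDFS paths are the `/apps/webhcat` locations used in the upload step of this page, `bin/hive` and `bin/pig` are assumed to sit at the root of each archive, `metastore-host` is a placeholder, and `webhcat-site-snippet.xml` is a hypothetical file name.

```shell
# Write a candidate webhcat-site.xml snippet to a local file for review.
# All values are assumptions; adjust them to the cluster before use.
cat > webhcat-site-snippet.xml <<'EOF'
<property>
  <name>templeton.hive.archive</name>
  <value>hdfs:///apps/webhcat/hive.tar.gz</value>
  <description>The path to the Hive archive file in HDFS.</description>
</property>
<property>
  <name>templeton.hive.path</name>
  <value>hive.tar.gz/bin/hive</value>
  <description>The path to the Hive executable.</description>
</property>
<property>
  <name>templeton.pig.archive</name>
  <value>hdfs:///apps/webhcat/pig.tar.gz</value>
  <description>The path to the Pig archive file in HDFS.</description>
</property>
<property>
  <name>templeton.pig.path</name>
  <value>pig.tar.gz/bin/pig</value>
  <description>The path to the Pig executable.</description>
</property>
<property>
  <name>templeton.streaming.jar</name>
  <value>hdfs:///apps/webhcat/hadoop-streaming.jar</value>
  <description>The path to the MapReduce streaming jar.</description>
</property>
<property>
  <name>templeton.hive.properties</name>
  <value>hive.metastore.uris=thrift://metastore-host:9083,hive.metastore.sasl.enabled=false</value>
  <description>Mainly focuses on the Hive metastore properties.</description>
</property>
EOF
```

The executable paths are relative to the archive name because WebHCat unpacks each archive under that name before running the command inside it.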
  • Add TEMPLETON_HOME to the WebHCat Server Environment Advanced Configuration Snippet (Safety Valve): TEMPLETON_HOME=/opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hive-hcatalog

  • Prepare the Hive archive for upload to HDFS

    • Change into the Hive directory and run tar -hzcvf hive.tar.gz ./ , e.g. cd /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hive and then run tar -hzcvf hive.tar.gz ./
  • Prepare the Pig archive for upload to HDFS

    • So that Pig can talk to HCatalog, add HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive and HCAT_HOME=/opt/cloudera/parcels/CDH/lib/hcatalog to the pig executable file.
    • Change into the Pig directory and run tar -hzcvf pig.tar.gz ./ , e.g. cd /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/pig and then run tar -hzcvf pig.tar.gz ./
  • Upload everything into HDFS under /apps/webhcat/

    • hadoop fs -copyFromLocal hive.tar.gz /apps/webhcat/hive.tar.gz
    • hadoop fs -copyFromLocal pig.tar.gz /apps/webhcat/pig.tar.gz
    • hadoop fs -copyFromLocal hadoop-streaming.jar /apps/webhcat/hadoop-streaming.jar
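The archive and upload steps can be wrapped in two small helpers. This is only a sketch: `build_archive` and `upload_to_hdfs` are hypothetical names, and the default target directory `/apps/webhcat` is taken from the commands on this page. Unlike running tar inside the source directory, the helper writes the archive somewhere else so that the archive does not try to include itself.

```shell
# Pack a component directory into a .tar.gz, following symlinks (-h).
# Entries are stored relative to the directory root (./bin/hive, ...).
build_archive() {
  src_dir=$1
  out_file=$2
  tar -hzcf "$out_file" -C "$src_dir" .
}

# Copy a local file into the WebHCat directory in HDFS.
# Requires the hadoop CLI; the default directory is an assumption.
upload_to_hdfs() {
  local_file=$1
  hdfs_dir=${2:-/apps/webhcat}
  hadoop fs -mkdir -p "$hdfs_dir" &&
    hadoop fs -copyFromLocal "$local_file" "$hdfs_dir/$(basename "$local_file")"
}

# Example, using the parcel path from this page:
# build_archive /opt/cloudera/parcels/CDH-5.7.1-1.cdh5.7.1.p0.11/lib/hive /tmp/hive.tar.gz
# upload_to_hdfs /tmp/hive.tar.gz
```

Listing the result with `tar -tzf /tmp/hive.tar.gz` before uploading is a quick way to confirm where the executable sits inside the archive.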
  • You might need to change the UID of the HTTP user so that it is greater than 1000.

    • usermod -u 10001 HTTP
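Before changing anything, it may be worth checking whether the account already satisfies the limit. A minimal sketch, where `uid_above` is a hypothetical helper and the threshold of 1000 is the one stated above:

```shell
# Succeeds when the account's numeric UID is strictly greater than the limit.
# Returns 2 when the account cannot be looked up.
uid_above() {
  user=$1
  limit=$2
  uid=$(id -u "$user") || return 2
  [ "$uid" -gt "$limit" ]
}

# Example: change the UID only when it is too low (run as root).
# uid_above HTTP 1000 || usermod -u 10001 HTTP
```

Note that `usermod -u` only re-owns files in the user's home directory, so files owned by the old UID elsewhere may need a manual `chown`.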
  • When done, restart the server.

  • You might hit a permission issue when job details are written to /templeton-hadoop/jobs. If so, grant write permission on /templeton-hadoop/jobs to all users.
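One possible way to grant that permission is sketched below. It assumes the caller can `sudo` to the `hdfs` superuser account, and that mode 777 is acceptable on the cluster; `run` and `open_job_dir` are hypothetical helper names. Setting `DRY_RUN=1` prints the command instead of executing it.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Make the WebHCat job directory writable by all users.
# Assumes sudo access to the hdfs superuser account.
open_job_dir() {
  job_dir=${1:-/templeton-hadoop/jobs}
  run sudo -u hdfs hadoop fs -chmod -R 777 "$job_dir"
}

# Example: preview first, then apply.
# DRY_RUN=1 open_job_dir
# open_job_dir
```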