EnableSSLinHDP

Script and instructions to enable SSL encryption for various HDP components

Summary

Enabling SSL encryption for the Web UIs that make up Hadoop is a tedious process that requires planning, learning to use security tools, and lots of mouse clicks through Ambari's UI. This article aims to simplify the process by presenting a semi-automated, start-to-finish example that enables SSL for the following Web UIs in the Hortonworks Sandbox:

  1. Ambari
  2. HBase
  3. Oozie
  4. Ranger
  5. HDFS

Planning

There is no substitute for reading the documentation. If you plan on enabling SSL in a production cluster, then make sure you are familiar with SSL concepts and the communication paths between each HDP component. In addition, plan on cluster downtime. Here are some concepts that you should know well:

  1. Certificate Authority (CA)
    A Certificate Authority is a company that others trust and that signs certificates for a fee. On a Mac you can view a list of CAs that your computer trusts by opening the "Keychain Access" application and clicking on "System Roots". If you don't want to pay one of these companies to sign your certificates, then you can generate your own CA. Just beware that Google Chrome and other browsers will present you with a privacy warning.
  2. Server SSL certificate
    These are files that prove the identity of something, in our case HDP services. Usually there is one certificate per hostname, and it is signed by a CA. A certificate has two pieces: the private key and the public certificate. The private key stays on the server and is used to sign or decrypt data. The public certificate is handed to clients so that they can verify the server and encrypt data that only the holder of the private key can read.
  3. Java private keystore
    When Java HDP services need to encrypt messages, they need a place to look for the private key part of a server's SSL certificate. This keystore holds those private keys. It should be kept secure so that attackers cannot impersonate the service. For this reason, each HDP component in this article has its own private keystore.
  4. Java trust keystore
    Just like my Mac has a list of CAs that it trusts, a Java process on a Linux machine needs the same. This keystore will usually hold the Public CA's certificate and any intermediary CA certificates. If a certificate was signed with a CA that you created yourself, then also add the public part of the server's SSL certificate to this keystore.
  5. Ranger plugins
    Ranger plugins communicate with the Ranger Admin server over SSL. What is important to understand is where each plugin executes and thus where server SSL certificates are needed. For HDFS, the plugin executes on the NameNodes; for HBase, on the RegionServers; for YARN, on the ResourceManagers. When you create server SSL certificates, use the hostnames where the plugins execute.
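
To make the CA and server certificate concepts above concrete, here is a minimal sketch, assuming the openssl command-line tool is installed. It is not taken from enable-ssl.sh. The temporary directory, the ExampleCA subject, and the example1.hortonworks.com hostname are placeholders chosen for illustration.

```shell
#!/bin/bash
# Sketch only: create a throwaway CA and sign one server certificate with it.
# The directory, file names, hostname, and subjects are placeholders.
set -e

workdir="$(mktemp -d)"
host="example1.hortonworks.com"   # hypothetical server hostname

# 1. Self-signed CA: a private key plus a public certificate.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/C=US/O=Example/CN=ExampleCA" \
  -keyout "${workdir}/ca.key" -out "${workdir}/ca.crt" 2>/dev/null

# 2. Server private key and a certificate signing request (CSR).
openssl req -newkey rsa:2048 -nodes \
  -subj "/C=US/O=Example/CN=${host}" \
  -keyout "${workdir}/${host}.key" -out "${workdir}/${host}.csr" 2>/dev/null

# 3. The CA signs the CSR, producing the server's public certificate.
openssl x509 -req -days 365 -in "${workdir}/${host}.csr" \
  -CA "${workdir}/ca.crt" -CAkey "${workdir}/ca.key" -CAcreateserial \
  -out "${workdir}/${host}.crt" 2>/dev/null

# 4. Anyone holding ca.crt can now check that the server certificate is genuine.
verify_result="$(openssl verify -CAfile "${workdir}/ca.crt" "${workdir}/${host}.crt")"
echo "${verify_result}"
```

In terms of the concepts above, ca.crt is the kind of file that belongs in a Java trust keystore (for instance via keytool -importcert), while the server's .key and .crt pair belongs in that service's private keystore.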

Enable SSL on HDP Sandbox

This part is rather easy. Install the HDP 2.4 Sandbox and follow the steps below. If you use an older version of the Sandbox, note that you'll need to change the Ambari password used in the script.

  1. Download my script
    wget "https://raw.githubusercontent.com/vzlatkin/EnableSSLinHDP/master/enable-ssl.sh"
    	
  2. Stop all services via Ambari (stop HDFS manually or turn off Maintenance Mode)
  3. Execute:
    /bin/bash enable-ssl.sh --all
    	
  4. Start all services via Ambari, which is now running on port 8443
  5. Go to the Ranger Admin UI and edit the HDFS and HBase services to set the Common Name for Certificate to sandbox.hortonworks.com

Enable SSL in production

There are two big reasons why enabling SSL in production can be more difficult than in a sandbox:

  1. Hadoop components may run in Highly Available (HA) mode. The solution in most cases is to create a single server SSL certificate and copy it to all HA servers. For Oozie, however, you'll need a special server SSL certificate with CN=*.domainname.com
  2. Public CAs may be used to sign server SSL certificates. Besides the extra time the CA needs to sign your certificates, you may also need additional steps to add intermediate CA certificates to the various Java trust stores, and you will need to find a CA that can sign non-FQDN server SSL certificates for Oozie HA
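
As a small illustration of the Oozie HA case, the wildcard CN can be derived from any one server's fully qualified hostname. This is a sketch only: wildcard_cn is a hypothetical helper that does not exist in enable-ssl.sh, and it assumes all Oozie servers share the same parent domain.

```shell
#!/bin/bash
# Sketch only: derive the wildcard CN for Oozie HA from one server's FQDN.
# wildcard_cn is a hypothetical helper, not part of enable-ssl.sh.
wildcard_cn() {
  local fqdn="$1"
  # Strip the first label ("example2") and keep the domain ("hortonworks.com").
  local domain="${fqdn#*.}"
  printf '*.%s\n' "${domain}"
}

wildcard_cn "example2.hortonworks.com"   # prints *.hortonworks.com
```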

If you are using Ranger to secure anything besides HBase and HDFS, then you will need to make changes to the script to enable extra plugins. The steps are similar to enabling SSL in the Sandbox:

  1. Download my script
    wget "https://raw.githubusercontent.com/vzlatkin/EnableSSLinHDP/master/enable-ssl.sh"
    	
  2. Make changes to these variables inside of the script to reflect your cluster layout. The script uses these variables to generate certificates and copy them to all machines where they are needed. Below is an example for my three node cluster.
    server1="example1.hortonworks.com"
    server2="example2.hortonworks.com"
    server3="example3.hortonworks.com"
    OOZIE_SERVER_ONE=$server2
    NAMENODE_SERVER_ONE=$server1
    RESOURCE_MANAGER_SERVER_ONE=$server3
    HISTORY_SERVER=$server1
    HBASE_MASTER_SERVER_ONE=$server2
    RANGER_ADMIN_SERVER=$server1
    ALL_NAMENODE_SERVERS="${NAMENODE_SERVER_ONE} $server2"
    ALL_OOZIE_SERVERS="${OOZIE_SERVER_ONE} $server3"
    ALL_HBASE_MASTER_SERVERS="${HBASE_MASTER_SERVER_ONE} $server3"
    ALL_HBASE_REGION_SERVERS="$server1 $server2 $server3"
    ALL_REAL_SERVERS="$server1 $server2 $server3"
    ALL_HADOOP_SERVERS="$server1 $server2 $server3"
    export AMBARI_SERVER=$server1
    AMBARI_PASS=xxxx
    CLUSTER_NAME=cluster1
    	
  3. If you are going to pay a Public CA to sign your server SSL certificates then copy them to /tmp/security and name them as such:
    ca.crt
    example1.hortonworks.com.crt
    example1.hortonworks.com.key
    example2.hortonworks.com.crt
    example2.hortonworks.com.key
    example3.hortonworks.com.crt
    example3.hortonworks.com.key
    hortonworks.com.crt
    hortonworks.com.key
    	
    The last certificate is needed for Oozie if you have Oozie HA enabled. The CN of that certificate should be CN=*.domainname.com.
    If you are NOT going to use a Public CA to sign your certificates, then change these lines in the script to be relevant to your organization:
    /C=US/ST=New York/L=New York City/O=Hortonworks/OU=Consulting/CN=HortonworksCA
    	
  4. Stop all services via Ambari
  5. Execute:
    /bin/bash enable-ssl.sh --all
    	
  6. Start all services via Ambari, which is now running on port 8443
  7. Go to the Ranger Admin UI and edit the HDFS and HBase services to set the Common Name for Certificate to the $NAMENODE_SERVER_ONE and $HBASE_MASTER_SERVER_ONE values that you specified in the above script
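
Before running the script with certificates signed by a Public CA, it may help to confirm that the files in /tmp/security follow the naming shown in step 3. The following is a sketch under that assumption only. missing_cert_files is a hypothetical helper, not something enable-ssl.sh provides, and it checks file names, not certificate contents.

```shell
#!/bin/bash
# Sketch only: check that the signed certificates are named as step 3 expects.
# missing_cert_files is a hypothetical helper, not part of enable-ssl.sh.
missing_cert_files() {
  local dir="$1"; shift
  local missing=0 host ext
  # The CA certificate is expected once, regardless of the host list.
  [ -f "${dir}/ca.crt" ] || { echo "missing: ${dir}/ca.crt"; missing=1; }
  # Every host needs both a public certificate (.crt) and a private key (.key).
  for host in "$@"; do
    for ext in crt key; do
      if [ ! -f "${dir}/${host}.${ext}" ]; then
        echo "missing: ${dir}/${host}.${ext}"
        missing=1
      fi
    done
  done
  return "${missing}"
}

# Usage with the variables from step 2 (word splitting on the list is intended):
# missing_cert_files /tmp/security ${ALL_REAL_SERVERS} && echo "all files present"
```

With Oozie HA, the domain name (hortonworks.com in the example above) can be passed as one more argument so that the wildcard certificate files are checked as well.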

If you choose not to enable SSL for some components, or decide to modify the script to include others (please send me a patch), then be aware of these dependencies:

  • Setting up the Ambari trust store is required before enabling SSL encryption for any other component
  • Before you enable HBase SSL encryption, enable Hadoop SSL encryption

Validation tips

  • View and verify SSL certificate being used by a server
    openssl s_client -connect ${OOZIE_SERVER_ONE}:11443 -showcerts  < /dev/null
    	
  • View Oozie jobs through command-line
    oozie jobs -oozie  https://${OOZIE_SERVER_ONE}:11443/oozie
    	
  • View certificates stored in a Java keystore
    keytool -list -storepass password -keystore /etc/hadoop/conf/hadoop-private-keystore.jks
    	
  • View Ranger policies for HDFS
    cat /tmp/security/example1.hortonworks.com.key /tmp/security/example1.hortonworks.com.crt > /tmp/security/example1.hortonworks.com.pem
    curl --cacert /tmp/security/ca.crt --cert /tmp/security/example1.hortonworks.com.pem "https://example1.hortonworks.com:6182/service/plugins/policies/download/cluster1_hadoop?lastKnownVersion=3&pluginId=hdfs@example1.hortonworks.com-cluster1_hadoop"
    	
  • Validate that Ranger plugins can connect to Ranger admin server by searching for util.PolicyRefresher in HDFS NameNode and HBase RegionServer log files
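
The log search in the last tip can be sketched as below. count_policy_refresher is a hypothetical helper, and the log file location is an assumption that depends on how your cluster is installed, so the usage line uses a placeholder path. The helper only counts matching lines. You still need to read them to see whether the plugin reported success or an error.

```shell
#!/bin/bash
# Sketch only: count util.PolicyRefresher lines in one log file.
# count_policy_refresher is a hypothetical helper; log locations vary by install.
count_policy_refresher() {
  local logfile="$1"
  # -F treats the pattern as a fixed string; -c prints the number of matching lines.
  # grep exits non-zero when nothing matches, so "|| true" keeps the helper's status at 0.
  grep -cF 'util.PolicyRefresher' "${logfile}" || true
}

# Usage (placeholder path, replace with your NameNode or RegionServer log file):
# count_policy_refresher /path/to/namenode.log
```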

References