/spark-ranger

Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.

Primary language: Scala. License: Apache License 2.0 (Apache-2.0).

Notice:

This library has been contributed to https://github.com/apache/submarine as a sub-module, and that module can still be used on its own.

The project here will no longer be updated.

If you have any questions, please read

https://github.com/apache/submarine/tree/master/docs/submarine-security/spark/README.md

to learn how to use the module. To give feedback to the Apache Submarine community, follow https://submarine.apache.org/community/contributors.html

Spark SQL Ranger Security Plugin

ACL Management for Apache Spark SQL with Apache Ranger, enabling:

  • Table/Column level authorization
  • Row level filtering
  • Data masking

Build

Spark SQL Ranger Security Plugin is built with Apache Maven:

mvn clean package -Pspark-2.3 -Pranger-1.0 -DskipTests

Currently, the available profiles are:

Spark: -Pspark-2.3, -Pspark-2.4

Ranger: -Pranger-1.0, -Pranger-1.1, -Pranger-1.2, -Pranger-2.0
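
Each build takes one Spark profile and one Ranger profile. As a minimal sketch, the shell function below checks a requested pair against the profiles listed above and prints the matching Maven command. The helper name spark_ranger_build_cmd is hypothetical and not part of this project.

```shell
#!/bin/sh
# Hypothetical helper, not part of spark-ranger: prints the Maven command
# for a Spark/Ranger profile pair and rejects versions not listed above.
spark_ranger_build_cmd() {
    spark_version="$1"
    ranger_version="$2"
    case "$spark_version" in
        2.3|2.4) ;;
        *) echo "unsupported Spark profile: $spark_version" >&2; return 1 ;;
    esac
    case "$ranger_version" in
        1.0|1.1|1.2|2.0) ;;
        *) echo "unsupported Ranger profile: $ranger_version" >&2; return 1 ;;
    esac
    echo "mvn clean package -Pspark-$spark_version -Pranger-$ranger_version -DskipTests"
}

# Example: print the command for Spark 2.4 with Ranger 2.0.
spark_ranger_build_cmd 2.4 2.0
```

The printed command can be run from the project root; it differs from the one shown earlier only in the two profile flags.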

Usage

Installation

Place the spark-ranger-<version>.jar into $SPARK_HOME/jars.
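
The copy step can be scripted. The sketch below assumes the jar was produced in a build directory such as Maven's default target/ (an assumption, since the output location is not stated here). The helper name install_spark_ranger is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of spark-ranger: copies the built plugin jar
# from a build directory into the jars directory of a Spark installation.
install_spark_ranger() {
    build_dir="$1"
    spark_home="$2"
    if [ ! -d "$spark_home/jars" ]; then
        echo "no jars directory under $spark_home" >&2
        return 1
    fi
    # The glob stands in for spark-ranger-<version>.jar.
    for jar in "$build_dir"/spark-ranger-*.jar; do
        # If nothing matched, the pattern is left unexpanded and -e fails.
        [ -e "$jar" ] || { echo "no spark-ranger jar in $build_dir" >&2; return 1; }
        cp "$jar" "$spark_home/jars/" || return 1
    done
}

# Usage, assuming the build output is in ./target:
#   install_spark_ranger target "$SPARK_HOME"
```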

Installation Addons

You can find some tips and known problems about this library here.

Configurations

Ranger admin client configurations

Create ranger-spark-security.xml in $SPARK_HOME/conf and add the following configuration to point the plugin at the right Ranger admin server:

<configuration>

    <property>
        <name>ranger.plugin.spark.policy.rest.url</name>
        <value>ranger admin address like http://ranger-admin.org:6080</value>
    </property>

    <property>
        <name>ranger.plugin.spark.service.name</name>
        <value>a ranger hive service name</value>
    </property>

    <property>
        <name>ranger.plugin.spark.policy.cache.dir</name>
        <value>./a ranger hive service name/policycache</value>
    </property>

    <property>
        <name>ranger.plugin.spark.policy.pollIntervalMs</name>
        <value>5000</value>
    </property>

    <property>
        <name>ranger.plugin.spark.policy.source.impl</name>
        <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>
    </property>

</configuration>

Create ranger-spark-audit.xml in $SPARK_HOME/conf and add the following configuration to enable or disable auditing:

<configuration>

    <property>
        <name>xasecure.audit.is.enabled</name>
        <value>true</value>
    </property>

    <property>
        <name>xasecure.audit.destination.db</name>
        <value>false</value>
    </property>

    <property>
        <name>xasecure.audit.destination.db.jdbc.driver</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>

    <property>
        <name>xasecure.audit.destination.db.jdbc.url</name>
        <value>jdbc:mysql://10.171.161.78/ranger</value>
    </property>

    <property>
        <name>xasecure.audit.destination.db.password</name>
        <value>rangeradmin</value>
    </property>

    <property>
        <name>xasecure.audit.destination.db.user</name>
        <value>rangeradmin</value>
    </property>

</configuration>

Enable the plugin via Spark extensions

spark.sql.extensions=org.apache.ranger.authorization.spark.authorizer.RangerSparkSQLExtension
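
This property can be set through Spark's usual mechanisms: per session with the --conf flag, or for all sessions in $SPARK_HOME/conf/spark-defaults.conf. The sketch below shows both. The helper name enable_ranger_extension is hypothetical, and it refuses to touch a file that already sets spark.sql.extensions, so that an existing extension list is not silently overridden.

```shell
#!/bin/sh
EXTENSION=org.apache.ranger.authorization.spark.authorizer.RangerSparkSQLExtension

# Hypothetical helper, not part of spark-ranger: appends the extension setting
# to spark-defaults.conf in the given conf directory, unless the file already
# contains a spark.sql.extensions entry.
enable_ranger_extension() {
    conf_file="$1/spark-defaults.conf"
    touch "$conf_file" || return 1
    if grep -q '^spark\.sql\.extensions' "$conf_file"; then
        echo "spark.sql.extensions already set in $conf_file; edit it by hand" >&2
        return 1
    fi
    # The leading newline guards against a file with no trailing newline.
    printf '\nspark.sql.extensions %s\n' "$EXTENSION" >> "$conf_file"
}

# Usage:
#   enable_ranger_extension "$SPARK_HOME/conf"
#
# Per-session alternative using the standard --conf flag:
#   spark-sql --conf spark.sql.extensions=$EXTENSION
```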