yaooqinn/spark-ranger

Security issue: Spark can run SQL on files directly

Closed this issue · 6 comments

Currently, Spark can run SQL on files directly, for example:
val sqlDF = spark.sql("SELECT * FROM parquet.`examples/src/main/resources/users.parquet`")

This lets other users explore data directly from files.
What do you think, @yaooqinn?

Use the Ranger HDFS plugin.

But is there any conflict between a Parquet Hive table (user table) and an HDFS Parquet file (user Parquet file) if I set different policies for them?

I want a custom filter to prevent users from doing that. Any suggestion on which class to look at?

I am not sure. You could inspect the logical plan and see whether there is a pattern you can match to prohibit such behavior.
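As a rough illustration of that idea, here is a minimal sketch that parses a statement without running it and looks for relations that appear to be a data source name followed by a path. It is not part of spark-ranger. The object `DirectFileQueryCheck`, its methods, and the `fileFormats` deny-list are hypothetical names made up for this example, and the path check is only a heuristic. It assumes Spark SQL is on the classpath and relies on `spark.sessionState`, which is an unstable API and may differ between Spark versions.

```scala
import java.util.Locale

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation

object DirectFileQueryCheck {
  // Hypothetical deny-list of data source names (an assumption, adjust as needed).
  private val fileFormats = Set("parquet", "orc", "json", "csv", "text", "avro")

  // Heuristic: the name starts with "<format>." and the rest contains a path separator.
  def looksLikeFileRelation(name: String): Boolean = {
    val dot = name.indexOf('.')
    dot > 0 &&
      fileFormats.contains(name.substring(0, dot).toLowerCase(Locale.ROOT)) &&
      name.substring(dot + 1).contains("/")
  }

  // Parse the statement only (nothing is executed) and return the names of
  // unresolved relations that look like direct file access.
  def directFileRelations(spark: SparkSession, sql: String): Seq[String] = {
    val plan = spark.sessionState.sqlParser.parsePlan(sql)
    plan.collect {
      case u: UnresolvedRelation if looksLikeFileRelation(u.tableName) => u.tableName
    }
  }
}
```

A caller could reject the statement when `DirectFileQueryCheck.directFileRelations(spark, sql)` is non-empty. Note that `collect` only walks the plan's child nodes, so relations that appear only inside subquery expressions may be missed, and paths without a `/` are not caught by this heuristic.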

OK, I will give it a try.
Thanks

Closing this issue as it doesn't relate to this plugin.
Thanks.