GoogleCloudDataproc/spark-bigquery-connector

Request to Stop Enforcing Delete Permissions on Materialization Dataset

Closed this issue · 1 comments

Hi, most organizations, like mine, do not grant delete access on datasets to service accounts, and no temp datasets are allotted to the service account for materialization. This causes failures when reading from views. I would like to understand why materialization is enforced on the user for a basic operation like reading from a view. I understand that the data needs to be materialized by BigQuery for obvious reasons, but it makes no sense for the user to have to worry about this. Any help or direction on this is much appreciated. Thank you!

Hi @reynoldspravindev,

Please check the reading-from-views documentation: https://github.com/GoogleCloudDataproc/spark-bigquery-connector?tab=readme-ov-file#reading-from-views
The BigQuery read session requires a table, so the connector must first materialize the view into one.
Regarding the materialization dataset: the connector may not have permission to create or delete datasets in the user's project, so it requires the user to supply an existing dataset it can write the materialized table into.
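For reference, a minimal sketch of what the documentation linked above describes: reading a view requires setting `viewsEnabled` and pointing `materializationDataset` at a pre-existing dataset the service account can write to. The project, dataset, and view names here are hypothetical placeholders, not anything from this issue.

```python
# Connector options for reading a BigQuery view (sketch; names are placeholders).
# "materializationDataset" must already exist -- the connector does not create it,
# which is exactly why it needs the user to supply one.
options = {
    "viewsEnabled": "true",                        # allow reading from views
    "materializationDataset": "existing_dataset",  # dataset the SA can write temp tables into
}

# With a SparkSession in hand, the read would then look like:
# df = (spark.read.format("bigquery")
#       .options(**options)
#       .load("my_project.my_dataset.my_view"))
```

The materialized table is temporary; the connector writes it into the supplied dataset with an expiration, which is why write (and in some setups delete) permissions on that dataset come into play.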