GoogleCloudDataproc/spark-bigquery-connector

Add Integer based partition Support

slice-amandata opened this issue · 3 comments

Add Integer based partition Support

@amandaolens we already have support for integer-range partitioning.
We are bound by the partitioning offered by BigQuery, and it is limited to one of two types:

Integer range partitioning, as also described in issue #867
Date/Time partitioning (daily, hourly, monthly, or yearly)
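
For readers unfamiliar with integer range partitioning, here is a minimal sketch of how BigQuery is documented to bucket values given a start (inclusive), an end (exclusive), and an interval. The function name `partition_for` and its return labels are illustrative only and are not part of the connector or of any BigQuery client library.

```python
from typing import Optional


def partition_for(value: Optional[int], start: int, end: int, interval: int) -> str:
    """Return an illustrative label for the partition a value would land in.

    Assumes BigQuery's documented behavior: ranges cover [start, end) in
    steps of `interval`, NULLs go to __NULL__, and out-of-range values go
    to __UNPARTITIONED__.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if end <= start:
        raise ValueError("end must be greater than start")
    if value is None:
        return "__NULL__"
    if value < start or value >= end:
        return "__UNPARTITIONED__"
    # Lower bound of the range that contains the value.
    lower = start + ((value - start) // interval) * interval
    return str(lower)
```

For example, with start 0, end 100, and interval 10, the value 25 falls into the range that begins at 20, while 100 is outside the range because the end is exclusive.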

@vishalkarve15 can you share a reference for integer-based partitioning?

It's not released yet. For now, you can refer to the README template: https://github.com/GoogleCloudDataproc/spark-bigquery-connector/blob/master/README-template.md (see the partitionField option).
In the meantime, you can try it out with the nightly builds (pick the one that matches your Spark version):

gs://spark-lib-nightly-snapshots/spark-2.4-bigquery-0.0.20230919.jar
gs://spark-lib-nightly-snapshots/spark-3.1-bigquery-0.0.20230919.jar
gs://spark-lib-nightly-snapshots/spark-3.2-bigquery-0.0.20230919.jar
gs://spark-lib-nightly-snapshots/spark-3.3-bigquery-0.0.20230919.jar
gs://spark-lib-nightly-snapshots/spark-3.4-bigquery-0.0.20230919.jar
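
As a rough sketch of how a write with one of these jars might be configured from PySpark: the helper below only builds the option map. The option names `partitionRangeStart`, `partitionRangeEnd`, and `partitionRangeInterval` are my reading of the README template linked above and should be checked against it; the helper name, column, bucket, and table names are hypothetical.

```python
from typing import Dict


def integer_range_partition_options(
    field: str, start: int, end: int, interval: int
) -> Dict[str, str]:
    """Build connector write options for an integer-range partitioned table.

    Option names are assumed from the connector's README template; verify
    them against the version of the jar you are using.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if end <= start:
        raise ValueError("end must be greater than start")
    return {
        "partitionField": field,
        "partitionRangeStart": str(start),
        "partitionRangeEnd": str(end),
        "partitionRangeInterval": str(interval),
    }


# Usage sketch (needs a Spark session with the connector jar on the classpath;
# "customer_id", "my-temp-bucket", and "my_dataset.my_table" are placeholders):
#
#   opts = integer_range_partition_options("customer_id", 0, 1000, 10)
#   (df.write.format("bigquery")
#      .options(**opts)
#      .option("temporaryGcsBucket", "my-temp-bucket")
#      .save("my_dataset.my_table"))
```

Keeping the options in a plain dictionary makes it easy to validate the range before any job is submitted.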