datafold/data-diff

Unexpected keyword `impersonate_service_account` error when running a dbt diff with potential fix

Stochastic-Squirrel opened this issue · 3 comments

Describe the bug

I am running a data-diff on a specific model using the following command:

data-diff --dbt -d --select user_metrics

This yields the following output:

Running with data-diff=0.9.8
12:48:00 INFO     Parsing file dbt_project.yml                                                                                                                                                                    dbt_parser.py:287
         INFO     Parsing file target/manifest.json                                                                                                              dbt_parser.py:280
12:48:02 INFO     config: prod_database='<REDACTED>' prod_schema='marts' prod_custom_schema='<custom_schema>' datasource_id=None                                                                          dbt_parser.py:159
         INFO     Parsing file profiles.yml                                                                                                                      dbt_parser.py:294
         DEBUG    Found PKs via META: ['metric_id']                                                                                                                                                               dbt_parser.py:449
         ERROR    Client.__init__() got an unexpected keyword argument 'impersonate_service_account'  

Describe the environment

macOS v14.0
data-diff==0.9.8
dbt-bigquery==1.6.7
dbt-core==1.6.6

Possible solution

I had a look at the file that was throwing the exception (data_diff/databases/bigquery.py). I think the cause is that the impersonate_service_account key is not popped from kw the way the keyfile arg is. So even if a user does not use the impersonate service account functionality (I do not), the kw variable still contains that key when the BigQuery client is instantiated, and it is passed on as an unexpected keyword argument.

As of 0.9.8 (lines 195-210)

        keyfile = kw.pop("keyfile", None)
        if keyfile:
            bigquery_service_account = import_bigquery_service_account()
            credentials = bigquery_service_account.Credentials.from_service_account_file(
                keyfile,
                scopes=["https://www.googleapis.com/auth/cloud-platform"],
            )
        elif kw.get("impersonate_service_account"):
            bigquery_service_account_impersonation = import_bigquery_service_account_impersonation()
            credentials = bigquery_service_account_impersonation.Credentials(
                source_credentials=credentials,
                target_principal=kw["impersonate_service_account"],
                target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
            )

        self._client = bigquery.Client(project=project, credentials=credentials, **kw)

My suggestion

        keyfile = kw.pop("keyfile", None)
        impersonate_service_account = kw.pop("impersonate_service_account", None)
        if keyfile:
            bigquery_service_account = import_bigquery_service_account()
            credentials = bigquery_service_account.Credentials.from_service_account_file(
                keyfile,
                scopes=["https://www.googleapis.com/auth/cloud-platform"],
            )
        elif impersonate_service_account:
            bigquery_service_account_impersonation = import_bigquery_service_account_impersonation()
            credentials = bigquery_service_account_impersonation.Credentials(
                source_credentials=credentials,
                target_principal=impersonate_service_account,
                target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
            )

        self._client = bigquery.Client(project=project, credentials=credentials, **kw)

which has resolved my issue
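
To show the difference between `kw.get` and `kw.pop` in isolation, here is a minimal sketch. `FakeClient`, `connect_with_get`, `connect_with_pop` and `raises_type_error` are hypothetical names used only for illustration; `FakeClient` stands in for `bigquery.Client` and is assumed to reject unknown keywords the same way. The sketch also assumes the key arrives in `kw` with a value of `None` when impersonation is not configured.

```python
class FakeClient:
    """Hypothetical stand-in for bigquery.Client: accepts only known keywords."""

    def __init__(self, project=None, credentials=None):
        self.project = project
        self.credentials = credentials


def connect_with_get(project, **kw):
    # Mirrors the 0.9.8 pattern: the key is read but stays inside kw.
    kw.pop("keyfile", None)
    if kw.get("impersonate_service_account"):
        pass  # credential setup elided in this sketch
    return FakeClient(project=project, credentials=None, **kw)


def connect_with_pop(project, **kw):
    # Mirrors the suggested pattern: the key is removed before kw is forwarded.
    kw.pop("keyfile", None)
    impersonate_service_account = kw.pop("impersonate_service_account", None)
    if impersonate_service_account:
        pass  # credential setup elided in this sketch
    return FakeClient(project=project, credentials=None, **kw)


def raises_type_error(connect, **kw):
    """Return True if calling connect with these keywords raises TypeError."""
    try:
        connect("my-project", **kw)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    # Assumed input: the key is present but unset.
    print(raises_type_error(connect_with_get, impersonate_service_account=None))
    print(raises_type_error(connect_with_pop, impersonate_service_account=None))
```

With `get`, the leftover key reaches the client constructor through `**kw`; with `pop`, it never does, regardless of whether impersonation is actually used.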

I am happy to open an MR if needed!

Cheers thanks for picking this up!

dlawin commented

> Cheers thanks for picking this up!

Thanks for reporting this, and writing up the resolution. That made things very simple!

I tagged you as a contributor on the release, but I'm not sure whether that puts you in the contribution list. That said, please feel free to open PRs!