expectations break when expecting a Struct in BigQuery
JakoboEU opened this issue ยท 11 comments
Generating an expected result such as:
{% call dbt_unit_testing.expect() %}
SELECT
DATE('2023-02-11') AS snapshot_date,
STRUCT(
'01' AS event_id,
DATE('2022-01-01') AS created_on,
'Sam' AS name
) AS event
UNION ALL
SELECT
DATE('2023-02-11') AS snapshot_date,
STRUCT(
'02' AS event_id,
DATE('2022-01-01') AS created_on,
'Bob' AS name
) AS event
{% endcall %}
Results in the following SQL being run against BigQuery:
select * FROM
(select count(1) as expectation_count from (
select count(1) as count, `snapshot_date`,`event` from (
SELECT
DATE('2023-02-11') AS snapshot_date,
STRUCT(
'01' AS event_id,
DATE('2022-01-01') AS created_on,
'Sam' AS name
) AS event
UNION ALL
SELECT
DATE('2023-02-11') AS snapshot_date,
STRUCT(
'02' AS event_id,
DATE('2022-01-01') AS created_on,
'Bob' AS name
) AS event
) as s group by `snapshot_date`,`event`
Which generates the following error:
Grouping by expressions of type STRUCT is not allowed at [21:37]
Because group by snapshot_date, event
is illegal due to event
being a struct.
dbt_unit_testing version:
- package: EqualExperts/dbt_unit_testing
version: 0.3.2
dbt version:
Running with dbt=1.5.1
I get exactly the same issue.
I think if we use the bigquery function to_json_string
the comparison would get around this problem.
Please take a look at the latest release (0.4.0) and check if it fixes this issue.
Thank you!
I still get exactly the same behaviour. The generated SQL being run against bigquery has a group by struct column. This produces the bigquery error:
"Grouping by expressions of type STRUCT is not allowed" .
@dbrtly solution would work; however, the test subject would need modifying to use TO_JSON_STRING
as well. I guess if the library replaced STRUCT( ... )
with TO_JSON_STRING(STRUCT( ...))
in both the expectation and the test subject then this would resolve the issue. Seems a bit hacky though having to change the test subject.
Hi @JakoboEU, can you post here the model and test that you are running? Did you apply a column transformation to all of the struct columns in the model?
@psousa50 I used the same test as in the issue description.
{% call dbt_unit_testing.expect() %}
SELECT
DATE('2023-02-11') AS snapshot_date,
STRUCT(
'01' AS event_id,
DATE('2022-01-01') AS created_on,
'Sam' AS name
) AS event
UNION ALL
SELECT
DATE('2023-02-11') AS snapshot_date,
STRUCT(
'02' AS event_id,
DATE('2022-01-01') AS created_on,
'Bob' AS name
) AS event
{% endcall %}
which gets compiled to
/* {"app": "dbt", "dbt_version": "1.6.6", "profile_name": "default", "target_name": "local", "node_id": "test.xxxxxxxxx"} */
select * FROM
(select count(1) as expectation_count from (
select count(1) as count, `snapshot_date`, `event` from (
SELECT
DATE('2023-02-11') AS snapshot_date,
STRUCT(
'01' AS event_id,
DATE('2022-01-01') AS created_on,
'Sam' AS name
) AS event
UNION ALL
SELECT
DATE('2023-02-11') AS snapshot_date,
STRUCT(
'02' AS event_id,
DATE('2022-01-01') AS created_on,
'Bob' AS name
) AS event
) as s group by `snapshot_date`, `event`
) as exp) as exp_count,
(select count(1) as actual_count from (
select count(1) as count, `snapshot_date`, `event` from (
with
`src_xxx` as (
select * from `xxx`.`xxx`.`xxx` where false)
select * from (
WITH
raw_xxx AS (
SELECT
to_hex(md5(cast(coalesce(cast(guid as
string
), '') || '-' || coalesce(cast(last_modified_date as
string
), '') as
string
))) AS event_id,
guid AS review_id,
....
FROM `src_xxx`
)
SELECT * EXCEPT(row_num)
FROM (
SELECT
*,
row_number() OVER (PARTITION BY event_id) AS row_num
FROM raw_xxx
)
WHERE row_num = 1
) as t ) as s group by `snapshot_date`, `event`
) as act) as act_count
HI @JakoboEU, I would like to see the full test, the dbt-unit-testing.test
macro, where you define your column transformations unless you are defining them in dbt_project.yml
Hi @psousa50 , I am not aware of any transformations.
The whole test looks like
{{
config(
tags=['unit-test', 'unit_test'],
run_date="2023-02-11"
)
}}
{% call dbt_unit_testing.test('xxx', 'blah blah blah') %}
{% call dbt_unit_testing.expect() %}
# as above
{% endcall %}
{% endcall %}
And I don't have anything defined in my dbt_project.yml
Oh, I see the problem now ๐ For you to be able to use STRUCT columns you need to apply a column transformation to those columns, as explained in the README here. Please refer to this use case for instructions on how to do it.
If you have any issue implementing this please let me know, I'll try to assist you
That's the ticket.
Thank you. Works well.
Cool ๐