[SEMANTIC-91] [Bug] metrics with count_distinct works in dbt 1.2 but not in dbt 1.3
Closed this issue · 3 comments
Is this a new bug in dbt_metrics?
- I believe this is a new bug in dbt_metrics
- I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
I run some tests before updating from dbt 1.2.3 + metrics 0.3.1 to dbt 1.3.1 + metrics 1.3.2.
A metric using count_distinct was working on the former setup but not in the latter. When computing the table calling the metric the following error occurred.
3:13:27 Database Error in model listing_amount_by_account_daily (models/05_pl_as24/metrics/daily/listing_amount_by_account_daily.sql)
13:13:27 Resources exceeded during query execution: The query could not be executed in the allotted memory. Peak usage: 132% of limit.
13:13:27 Top memory consumer(s):
13:13:27 ORDER BY operations: 99%
13:13:27 other/unattributed: 1%
13:13:27
13:13:27 compiled Code at target/run/dwh_automotive/models/05_pl_as24/metrics/daily/listing_amount_by_account_daily.sql
I think the issue might be related to this bug but I'm not entirely sure.
Expected Behavior
The error doesn't appear.
Steps To Reproduce
We need a lot of data to get the same error. Nevertheless, I noticed that what breaks the thing is the order by clause at the end of the query. I've attached the generated SQL for the current setup (dbt 1.2.3 + metrics 0.3.1) and the new setup (dbt 1.3.1 + metrics 1.3.2).
Just to be sure, I've renamed the GCP project, datasets, and tables.
dbt_1.2.3-listing_amount_by_account_daily.txt
dbt_latest-listing_amount_by_account_daily.txt
Relevant log output
No response
Environment
% dbt --version
Core:
- installed: 1.3.1
- latest: 1.3.1 - Up to date!
Plugins:
- bigquery: 1.3.0 - Up to date!
-----
packages
packages:
- package: dbt-labs/dbt_utils
version: 0.8.6
- package: calogica/dbt_expectations
version: 0.5.3
- package: dbt-labs/audit_helper
version: 0.5.0
- package: data-mie/dbt_profiler
version: 0.4.1
- package: dbt-labs/metrics
version: 0.3.1
- package: elementary-data/elementary
version: 0.5.3
Which database adapter are you using with dbt?
bigquery
Additional Context
No response
Hey @albton - thanks for raising this issue! We had noticed this pop up with datasets in Redshift & Postgres but were hoping it was constrained to databases that are more compute constrained. Now that we've seen it in BigQuery, its convinced me that we need to roll back this ordering functionality to be more simplistic.
I've merged in the changes to main
and will release the new version on Friday. In the meantime, I'd love to see if this works in your local if you use the main branch instead of the released version. Here is the code you could use for that:
packages:
- git: "https://github.com/dbt-labs/dbt_metrics.git" # git URL
revision: main
Hi @callum-mcdata ! Thanks for checking this issue and reacting!
Luckily we "solved" the problem by deleting the metrics that were generating the issue from our data warehouse: it turned out they weren't really used. I should thank this bug for this :) I don't feel like taking the time to revert dev to the previous state to check right now. If I get some time this week I'll let you know.
No worries and no need to take the time! I'll close out this issue then and if you run into any other bugs in the future be sure to let us know - we appreciate you flagging this one!