dbt-labs/dbt_metrics

[SEMANTIC-91] [Bug] metrics with count_distinct works in dbt 1.2 but not in dbt 1.3

Closed this issue · 3 comments

Is this a new bug in dbt_metrics?

  • I believe this is a new bug in dbt_metrics
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I run some tests before updating from dbt 1.2.3 + metrics 0.3.1 to dbt 1.3.1 + metrics 1.3.2.

A metric using count_distinct was working on the former setup but not in the latter. When computing the table calling the metric the following error occurred.

3:13:27  Database Error in model listing_amount_by_account_daily (models/05_pl_as24/metrics/daily/listing_amount_by_account_daily.sql)
13:13:27    Resources exceeded during query execution: The query could not be executed in the allotted memory. Peak usage: 132% of limit.
13:13:27    Top memory consumer(s):
13:13:27      ORDER BY operations: 99%
13:13:27      other/unattributed: 1%
13:13:27    
13:13:27    compiled Code at target/run/dwh_automotive/models/05_pl_as24/metrics/daily/listing_amount_by_account_daily.sql

I think the issue might be related to this bug but I'm not entirely sure.

Expected Behavior

The error doesn't appear.

Steps To Reproduce

We need a lot of data to get the same error. Nevertheless, I noticed that what breaks the thing is the order by clause at the end of the query. I've attached the generated SQL for the current setup (dbt 1.2.3 + metrics 0.3.1) and the new setup (dbt 1.3.1 + metrics 1.3.2).

Just to be sure, I've renamed the GCP project, datasets, and tables.
dbt_1.2.3-listing_amount_by_account_daily.txt
dbt_latest-listing_amount_by_account_daily.txt

Relevant log output

No response

Environment

% dbt --version
Core:
  - installed: 1.3.1
  - latest:    1.3.1 - Up to date!

Plugins:
  - bigquery: 1.3.0 - Up to date!

-----

packages

packages:
  - package: dbt-labs/dbt_utils
    version: 0.8.6
  - package: calogica/dbt_expectations
    version: 0.5.3
  - package: dbt-labs/audit_helper
    version: 0.5.0
  - package: data-mie/dbt_profiler
    version: 0.4.1
  - package: dbt-labs/metrics
    version: 0.3.1
  - package: elementary-data/elementary
    version: 0.5.3

Which database adapter are you using with dbt?

bigquery

Additional Context

No response

Hey @albton - thanks for raising this issue! We had noticed this pop up with datasets in Redshift & Postgres but were hoping it was constrained to databases that are more compute constrained. Now that we've seen it in BigQuery, its convinced me that we need to roll back this ordering functionality to be more simplistic.

I've merged in the changes to main and will release the new version on Friday. In the meantime, I'd love to see if this works in your local if you use the main branch instead of the released version. Here is the code you could use for that:

packages:
  - git: "https://github.com/dbt-labs/dbt_metrics.git" # git URL
    revision: main

Hi @callum-mcdata ! Thanks for checking this issue and reacting!

Luckily we "solved" the problem by deleting the metrics that were generating the issue from our data warehouse: it turned out they weren't really used. I should thank this bug for this :) I don't feel like taking the time to revert dev to the previous state to check right now. If I get some time this week I'll let you know.

No worries and no need to take the time! I'll close out this issue then and if you run into any other bugs in the future be sure to let us know - we appreciate you flagging this one!