dbt-labs/dbt_metrics

[SEMANTIC-236] [Bug] Calculation of multiple metrics within one model uses 0 instead of NULL as default metric value

Closed this issue · 1 comments

Is this a new bug in dbt_metrics?

  • I believe this is a new bug in dbt_metrics
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I have defined 2 metrics and I am using them in a model:

select * 
from {{ metrics.calculate(
    [
          metric('orders_created')
        , metric('orders_finished')
    ],
    grain='day',
    dimensions=['city_id']
) }}

The compiled code renders as

model_d5ece54740ba14b0f801414d351375b0__final as (
    
    select
        parent_metric_cte.date_day,
        parent_metric_cte.city_id,
        coalesce(orders_created, 0) as orders_created
    from model_d5ece54740ba14b0f801414d351375b0__aggregate as parent_metric_cte
)

and

model_fbc8530284b218b1202dabb400ac7567__final as (
    
    select
        parent_metric_cte.date_day,
        parent_metric_cte.city_id,
        coalesce(orders_finished, 0) as orders_finished
    from model_fbc8530284b218b1202dabb400ac7567__aggregate as parent_metric_cte
)

Notice the

coalesce(orders_created, 0)

and

coalesce(orders_finished, 0)

So in case there are cities that have values for one metric but don't have values for other metric, the metric value will be 0.

Expected Behavior

I expect the coalesce to use null as default:

coalesce(orders_created, null)

and

coalesce(orders_finished, null)

Here is an example of the two different behaviours (mock data produced elsewhere):
image

NULL means "I don't know". 0 means the value of 0. The latter might be very inappropriate in some cases. Let's take a simple example of users rating 3 different questions. There is a big difference if the rating is said to be 0 or null (did not answer this question).

Steps To Reproduce

See above.

Relevant log output

No response

Environment

dbt --version
Core:
  - installed: 1.4.5
  - latest:    1.4.5 - Up to date!

Plugins:
  - databricks: 1.4.2 - Up to date!
  - spark:      1.4.1 - Up to date!

Packages:
  - package: dbt-labs/metrics
    version: 1.4.1

Which database adapter are you using with dbt?

databricks, spark

Additional Context

No response

The answer from @callum-mcdata in Slack:

This is actually a property that you can set in your config!
We have folks who fall on both sides of what the default value is but in order to support secondary calcs we use 0.