dbt-labs/dbt-core

[Bug] unit test errors on join logic without table aliases

graciegoheen opened this issue · 1 comments

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I’m trying to unit test my fct_orders model:

select
    order_id,
    customer_id,
    order_date,
    days_since_ordered,
    candy_name,
    candy_price,
    candy_type,
    case 
        when candy_type = 'bar' and is_golden_ticket_candidate then true 
        else false
    end as does_contain_golden_ticket,

from {{ ref('stg_orders') }}
left join {{ ref('stg_candies')}}
    on stg_orders.candy_id = stg_candies.candy_id

which contains join logic without table aliases.

I define the following unit test:

unit_tests:
  - name: test_does_contain_golden_ticket # this is the unique name of the test
    description: >
      An order can only contain a golden ticket if it's been selected as a golden ticket candidate
      and it's a bar.
    model: fct_orders # name of the model I'm unit testing
    given: # the mock data for your inputs
      - input: ref('stg_orders')
        rows:
         - {order_id: 1, candy_id: 1, is_golden_ticket_candidate: true}
         - {order_id: 2, candy_id: 2, is_golden_ticket_candidate: false}
         - {order_id: 3, candy_id: 3, is_golden_ticket_candidate: true}
         - {order_id: 4, candy_id: 4, is_golden_ticket_candidate: false}
      - input: ref('stg_candies')
        rows:
         - {candy_id: 1, candy_type: BaR} # mixed case "bar" candy_type
         - {candy_id: 2, candy_type: BaR} 
         - {candy_id: 3, candy_type: not a bar} # candy_type is something besides "bar"
         - {candy_id: 4, candy_type: not a bar} 
    expect: # the expected output given the inputs above
      rows:
        - {order_id: 1, does_contain_golden_ticket: true} # candidate and bar
        - {order_id: 2, does_contain_golden_ticket: false} # not a candidate and bar
        - {order_id: 3, does_contain_golden_ticket: false} # candidate and not a bar
        - {order_id: 4, does_contain_golden_ticket: false} # not a candidate and not a bar

I get the error:

   Database Error
    000904 (42000): SQL compilation error: error line 99 at position 7
    invalid identifier 'STG_ORDERS.CANDY_ID'     

From the SQL:

create or replace temporary table DEVELOPMENT.dbt_ggoheen.test_does_contain_golden_ticket__dbt_tmp
         as
        (select * from (
        with __dbt__cte__stg_orders as (

-- Fixture for stg_orders
select 
    
        try_cast('1' as NUMBER(38,0))
     as order_id, try_cast(null as NUMBER(38,0)) as customer_id, 
    
        try_cast('1' as NUMBER(38,0))
     as candy_id, 
    
        try_cast('True' as BOOLEAN)
     as is_golden_ticket_candidate, try_cast(null as DATE) as order_date, try_cast(null as NUMBER(9,0)) as days_since_ordered
union all
select 
    
        try_cast('2' as NUMBER(38,0))
     as order_id, try_cast(null as NUMBER(38,0)) as customer_id, 
    
        try_cast('2' as NUMBER(38,0))
     as candy_id, 
    
        try_cast('False' as BOOLEAN)
     as is_golden_ticket_candidate, try_cast(null as DATE) as order_date, try_cast(null as NUMBER(9,0)) as days_since_ordered
union all
select 
    
        try_cast('3' as NUMBER(38,0))
     as order_id, try_cast(null as NUMBER(38,0)) as customer_id, 
    
        try_cast('3' as NUMBER(38,0))
     as candy_id, 
    
        try_cast('True' as BOOLEAN)
     as is_golden_ticket_candidate, try_cast(null as DATE) as order_date, try_cast(null as NUMBER(9,0)) as days_since_ordered
union all
select 
    
        try_cast('4' as NUMBER(38,0))
     as order_id, try_cast(null as NUMBER(38,0)) as customer_id, 
    
        try_cast('4' as NUMBER(38,0))
     as candy_id, 
    
        try_cast('False' as BOOLEAN)
     as is_golden_ticket_candidate, try_cast(null as DATE) as order_date, try_cast(null as NUMBER(9,0)) as days_since_ordered
),  __dbt__cte__stg_candies as (

-- Fixture for stg_candies
select 
    
        try_cast('1' as NUMBER(38,0))
     as candy_id, try_cast(null as character varying(16777216)) as candy_name, 
    
        try_cast('BaR' as character varying(16777216))
     as candy_type, try_cast(null as NUMBER(16,2)) as candy_price
union all
select 
    
        try_cast('2' as NUMBER(38,0))
     as candy_id, try_cast(null as character varying(16777216)) as candy_name, 
    
        try_cast('BaR' as character varying(16777216))
     as candy_type, try_cast(null as NUMBER(16,2)) as candy_price
union all
select 
    
        try_cast('3' as NUMBER(38,0))
     as candy_id, try_cast(null as character varying(16777216)) as candy_name, 
    
        try_cast('not a bar' as character varying(16777216))
     as candy_type, try_cast(null as NUMBER(16,2)) as candy_price
union all
select 
    
        try_cast('4' as NUMBER(38,0))
     as candy_id, try_cast(null as character varying(16777216)) as candy_name, 
    
        try_cast('not a bar' as character varying(16777216))
     as candy_type, try_cast(null as NUMBER(16,2)) as candy_price
) select
    order_id,
    customer_id,
    order_date,
    days_since_ordered,
    candy_name,
    candy_price,
    candy_type,
    case 
        when candy_type = 'bar' and is_golden_ticket_candidate then true 
        else false
    end as does_contain_golden_ticket,

from __dbt__cte__stg_orders
left join __dbt__cte__stg_candies
    on stg_orders.candy_id = stg_candies.candy_id
    ) as __dbt_sbq
    where false
    limit 0

        );

stg_orders does have a candy_id column. but the query being generated does not reference stg_orders, it references __dbt__cte__stg_orders. so this join:

from __dbt__cte__stg_orders
left join __dbt__cte__stg_candies
    on stg_orders.candy_id = stg_candies.candy_id
    ) as __dbt_sbq

is failing.

This will work as long as I alias my table names:

select
    order_id,
    customer_id,
    order_date,
    days_since_ordered,
    candy_name,
    candy_price,
    candy_type,
    case 
        when candy_type = 'bar' and is_golden_ticket_candidate then true 
        else false
    end as does_contain_golden_ticket

from {{ ref('stg_orders') }} as stg_orders
left join {{ ref('stg_candies')}} as stg_candies
    on stg_orders.candy_id = stg_candies.candy_id

Expected Behavior

I should be able to unit test join logic without table aliases

Which database adapter are you using with dbt?

Snowflake

I'm going to close this as won't do and document this as a known exception.

It is out best practice to alias table names.