elementary-data/dbt-data-reliability

Test failing to run since upgrading to 0.13

philk1991 opened this issue · 4 comments

Hey,

We're seeing a couple of tests fail to run since upgrading from 0.11.2 to 0.13. The tests in question have to use dot notation (e.g. updated_at.member0) to reach the nested timestamp field for the volume anomalies test, and this has been failing since the upgrade. FYI, we're using Databricks as our destination in dbt.

The test is displayed below.

      - name: transcription_jobs
        tests:
        - elementary.volume_anomalies:
            timestamp_column: updated_at.member0
            backfill_days: 1
            config:
              severity: warn
        - dbt_expectations.expect_table_row_count_to_be_between:
            min_value: 400000

I've also tried wrapping the timestamp column like this: timestamp_column: updated_at.member0. When doing this I receive a Unicode error, which may or may not be related to this recent change on the Python version of elementary.

Hey @philk1991 ,
Could you please share the error message you get?

(Also, the test code is all in the dbt package, so it can't be related to the Python change.)

13:07:26  Completed with 1 error and 1 warning:
13:07:26  
13:07:26    Runtime Error in test elementary_source_volume_anomalies_glean_transcription_jobs_1__updated_at_member0 (dbt/models/staging/glean/_glean__sources.yml)
  
  [PARSE_SYNTAX_ERROR] Syntax error at or near '.'.(line 24, pos 63)
  
  == SQL ==
  /* {"app": "dbt", "dbt_version": "1.7.3", "dbt_databricks_version": "1.7.2", "databricks_sql_connector_version": "2.9.3", "profile_name": "glean_dbt", "target_name": "dev", "node_id": "test.glean_dbt.elementary_source_volume_anomalies_glean_transcription_jobs_1__updated_at_member0.7a850d2ba1"} */
  
      
    
      
          create or replace table `hive_metastore`.`dbt_phil`.`test_7a850d2ba1_elementary_source_volume_anomalies_glean_transcription_jobs_1__updated_at_me__metrics__tmp_20231219130534605947`
        
        
      using delta
        
        
        
        
        
        
        
        as
        
  
      
          
      with monitored_table as (
          select
              cast(updated_at.member0 as timestamp) as updated_at.member0
  ---------------------------------------------------------------^^^
          from `hive_metastore`.`glean_replica_airbyte`.`transcription_jobs`
          where cast(updated_at.member0 as timestamp) >= cast('2023-12-18T00:00:00+00:00' as timestamp)
          
      ),

It's the cast of the timestamp column that's failing: the syntax error points at the dot in the alias after `as`, not at the cast expression itself.
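
A rough sketch of how the generated clause could keep the nested field inside the cast but use a dot-free alias. This only illustrates the idea: the helper names `safe_alias` and `timestamp_select` are hypothetical and not part of elementary, and the package's actual fix may work differently.

```python
import re


def safe_alias(column_expr: str) -> str:
    """Turn a column expression such as 'updated_at.member0' into a plain identifier.

    Hypothetical helper: every character that is not a letter, digit,
    or underscore is replaced with an underscore.
    """
    return re.sub(r"[^0-9A-Za-z_]", "_", column_expr)


def timestamp_select(column_expr: str) -> str:
    """Build the cast clause, keeping the dotted path in the cast only."""
    return f"cast({column_expr} as timestamp) as {safe_alias(column_expr)}"


print(timestamp_select("updated_at.member0"))
# cast(updated_at.member0 as timestamp) as updated_at_member0
```

With an alias like this, the dotted path is only used where a nested field reference is allowed, and the rest of the query would refer to the column by the underscore name.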

Hey @philk1991 ,
The fix will be out with the next release 🙏🏻

@Maayan-s Amazing, thanks for your help!