markporoshin/dbt-greenplum

add custom `greenplum__snapshot_staging_table` to support snapshots

Opened this issue · 4 comments

Snapshots fail on Greenplum because it is based on an older version of Postgres. Equivalent issues for dbt-postgres: dbt-labs/dbt-core#1665, dbt-labs/dbt-core#3064

The fix is to do as suggested in the issues above and add a new macro `greenplum__snapshot_staging_table()` that is identical to `default__snapshot_staging_table`, except that an explicit type cast is added on three lines:

```sql
'insert'::text as dbt_change_type,
...
'update'::text as dbt_change_type,
...
'delete'::text as dbt_change_type,
```
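A minimal sketch of such an override, placed in the adapter (or in a project's `macros/` directory). The body other than the casts would be copied verbatim from `default__snapshot_staging_table` in dbt-core; the elided parts are indicated with `...`:

```sql
{# Sketch only: copy the full body of default__snapshot_staging_table from #}
{# dbt-core ~1.5 and change only the three dbt_change_type literals.       #}
{% macro greenplum__snapshot_staging_table(strategy, source_sql, target_relation) -%}
    ...
        'insert'::text as dbt_change_type,
    ...
        'update'::text as dbt_change_type,
    ...
        'delete'::text as dbt_change_type,
    ...
{%- endmacro %}
```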

Did implementing it solve this on your side? I have tried, but it fails later down the line:

```
17:00:06  Completed with 1 error and 0 warnings:
17:00:06
17:00:06  maximum recursion depth exceeded while calling a Python object
17:00:06
17:00:06  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
```

Hi 👋

I think the problem is that, in the long CTE chain dbt v1.5.9 generates for `dbt snapshot`, the `insertions_source_data` CTE needs to CAST the `md5(XYZ)` it aliases as `dbt_scd_id` to text. Everything else in the long query runs until you try to UNION the insertions, updates, and deletes CTEs. The insertions CTE creates `dbt_scd_id`; it's the first one in the failing UNIONed query, and it's the only one that isn't inheriting a field data type.
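Concretely, the change described above would look roughly like this inside the generated `insertions_source_data` CTE (a sketch, not the exact generated SQL; `id` and `updated_at` stand in for whatever unique key and strategy columns the snapshot uses):

```sql
insertions_source_data as (

    select
        *,
        -- Cast so the later UNION of insertions/updates/deletes CTEs
        -- sees a concrete text type instead of "unknown" on old Postgres
        md5(coalesce(cast(id as varchar), '')
            || '|'
            || coalesce(cast(updated_at as varchar), ''))::text as dbt_scd_id
    from snapshot_query

),
```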

How did I get myself to this issue? I'm having the same problem as the core issues 1665 and 3064 that Anders mentioned. Apparently, spinning up a postgres database in cPanel v110.0.23 gets me this version of postgres: PostgreSQL 9.6.22 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit.

Which I think means it's better to address this upstream than in only the Greenplum adapter?

Or could someone tell me where the current version of the snapshot code is hiding? https://github.com/dbt-labs/dbt-core/blob/38f278cce0a5ccd2f92cc3381f3f0d02d30babe8/core/dbt/include/global_project/macros/materializations/snapshot/snapshot.sql#L142 is referenced in 3064, but that file doesn't exist under the 1.5.9 version tag. If it's a macro, I can try overriding it. If it's not overridable, then dbt Cloud still can't run snapshots on Postgres 9 (which is totally legit to say is a me problem).

(As I was spelunking for a place to override with this change I saw the words postgres and index together and remembered that you actually are NOT supposed to put indexes on Greenplum tables. I know, I know, 🤨 "you have to be joking", but true. Even if you think it's working, just say no.)

> Did implementing it solve this on your side? I have tried, but it fails later down the line:
>
> ```
> 17:00:06  Completed with 1 error and 0 warnings:
> 17:00:06
> 17:00:06  maximum recursion depth exceeded while calling a Python object
> 17:00:06
> 17:00:06  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
> ```

I managed to solve the issue by applying the fix suggested by @dataders and then modifying the `snapshot_merge.sql` file located here:

https://github.com/markporoshin/dbt-greenplum/blob/11827c25f5d6916009aa089b93197b8cf4cc1cb2/dbt/include/greenplum/macros/materializations/snapshot_merge.sql#L2C1-L4C15

to the following:

```sql
{% macro greenplum__snapshot_merge_sql(target, source, insert_cols) -%}
  {{ return(postgres__snapshot_merge_sql(target, source, insert_cols)) }}
{% endmacro %}
```

This should resolve the recursion issue.
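If you cannot (or would rather not) modify the installed adapter package, the same override can be dropped into your own dbt project, since dbt's macro dispatch searches the root project before the adapter. A sketch (the filename is arbitrary):

```sql
-- macros/snapshot_merge.sql, in your own dbt project's macros/ directory.
-- Delegates Greenplum snapshot merges to the stock Postgres implementation.
{% macro greenplum__snapshot_merge_sql(target, source, insert_cols) -%}
  {{ return(postgres__snapshot_merge_sql(target, source, insert_cols)) }}
{% endmacro %}
```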

@dataders, @b00033811
Hi!

So I am pretty new to dbt and I'm stuck with the same problem. Can you please explain how to fix it?
When I run the `dbt snapshot` command a second time, I get `failed to find conversion function from unknown to text`
on my Greenplum.
What should I change, and where, to make it work?

I have:

- dbt version 1.5.9
- Registered adapter: greenplum version 1.5.0

Should I just upgrade both for the problem to go away?

Thanks in advance!