dbt-labs/dbt-redshift

[Bug] status and rowcount not reported beginning with version `1.5.0`

soroushmalek67 opened this issue · 3 comments

Is this a new bug in dbt-redshift?

  • I believe this is a new bug in dbt-redshift
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Hello Everyone!
After updating to dbt 1.7 from 1.3, all of a sudden, for our jobs on our models (incremental, snapshots, tables) and for both dbt run and dbt build, dbt only returns this syntax: SUCCESS in XX.XXs, once the run is done for the model.

It used to be like this before: INSERT 0 14330 in 16.87s (first snapshot), which was very useful, especially for our incremental/snapshot models, as we wanted to check how many rows had been upserted in them.

Now, we don't see anything for our models in that way (second snapshot). note that we're using Registered adapter: redshift=1.7.5.

excerpts from dbt Labs response in community Slack thread in #dbt-cloud-cli-and-ide:

the message there is determined by the adapter_response field in the run_results.json object, and the message is determined entirely by whatever the adapter chooses to send back when you perform a given query (i.e the message for CREATE TABLE may be different in postgres vs snowflake!)

I believe this is a side effect of dbt-redshift changing how it interacts with Redshift, it switched to redshift-connector from psycopg2 in 1.5? Don't remember the exact adapter version.
The purpose of the change was ongoing compatibility as Redshift changes to a more Snowflake-y model (decoupling storage from compute, taking advantage of ra3 nodes, etc) and was driven by the AWS engineers who contribute to the adapter.
All this to say I think your best bet is to open an issue against dbt-redshift and ask for that metadata to be restored, but I assume it is not currently possible (otherwise it likely would have been preserved in the migration) and may require enhancements in an upstream project like redshift-connector itself.

Screenshot 2024-04-19 at 8 39 04 AM (1)
Screenshot 2024-04-19 at 8 39 42 AM (1)

Expected Behavior

To simply show how many rows were inserted in the incremental and snapshot jobs.

Steps To Reproduce

Based on dbt lab recommendation for troubleshooting:

  • The metadata should be restored, which might not be currently possible.
  • It may require enhancements in an upstream project, like redshift-connector itself

Relevant log output

No response

Environment

- OS: Mac
- Python: 3.9.18
- dbt-core: 1.7.13
- dbt-redshift: 1.7.7

@soroushmalek67 thanks for reporting this! This is an on-going regression due to a limitation of the redshift-connector Python driver. I've opened aws/amazon-redshift-python-driver#220 to request that SQL State be provided as a Cursor attribute.

Indeed this change began in #251, which was released initially in dbt-redshift 1.5.0. As @dave-connors-3 pointed out the status and rows modified comes from the driver's Cursor. there's a note on RedshiftConnectionManager.get_response()

redshift_connector.Cursor doesn't have a status message attribute but [because] this function is only used for successful run[s], so we can just return a dummy "SUCCESS"

def get_response(cls, cursor: redshift_connector.Cursor) -> AdapterResponse:
# redshift_connector.Cursor doesn't have a status message attribute but
# this function is only used for successful run, so we can just return a dummy
rows = cursor.rowcount
message = "SUCCESS"
return AdapterResponse(_message=message, rows_affected=rows)

Other data platforms similarly do not provide this information, accordingly the respective dbt adapters lack this information, namely: