calogica/dbt-date

day_of_week, day_of_month, day_of_year are not cast to integer by default

sparcs opened this issue · 11 comments

Hey folks, this is just a minor issue.

The fields:

  • day_of_week
  • day_of_month
  • day_of_year

are not cast to an integer type by default. I've observed this on Redshift, and it may affect other platforms as well. This is unlike the week_of_year field, which is cast to integer. I can submit a pull request to cast the above fields as well if that seems like a reasonable thing to expect.
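
For illustration, a minimal sketch of what this looks like from a model (the `orders` model and `order_date` column are hypothetical, and the exact macro signatures may differ slightly by package version); the casts are the call-site workaround until the package does it by default:

```sql
select
    {# On Redshift these currently compile to date_part(), which returns double precision #}
    {{ dbt_date.day_of_week('order_date', isoweek=true) }} as day_of_week_raw,
    {# Call-site workaround for now: cast explicitly #}
    cast({{ dbt_date.day_of_month('order_date') }} as integer) as day_of_month_int,
    cast({{ dbt_date.day_of_year('order_date') }} as integer) as day_of_year_int
from {{ ref('orders') }}
```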

Thanks! That's good to know and an easy fix. Happy to take any PRs if you'd like, or I'll try and get to it some time this weekend.

Sure, will send it right away. Thanks!

FYI, according to the Redshift docs, this is supposed to return a double? https://docs.aws.amazon.com/redshift/latest/dg/r_DATE_PART_function.html
Or am I reading this incorrectly? So, you're looking to cast the double to an int?

Looks like in BQ, extracts return a long https://cloud.google.com/bigquery/docs/reference/standard-sql/date_functions#extract so it probably makes sense to cast this to a bigint in the Redshift case.
Snowflake returns a smaller int (like Number(2,0)) for things like day of week.
I'd love to only do that for Redshift though, so we'll likely need a new Redshift-specific date_part macro rather than modifying the default version.
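
For reference, such an adapter-specific override would follow dbt's dispatch pattern, roughly like this sketch (not the package's actual source; the dispatch call syntax varies with dbt version):

```sql
{%- macro date_part(datepart, date) -%}
    {{ adapter.dispatch('date_part', 'dbt_date') (datepart, date) }}
{%- endmacro %}

{%- macro default__date_part(datepart, date) -%}
    date_part('{{ datepart }}', {{ date }})
{%- endmacro %}

{# Hypothetical Redshift override: cast the double precision result to bigint #}
{%- macro redshift__date_part(datepart, date) -%}
    cast(date_part('{{ datepart }}', {{ date }}) as bigint)
{%- endmacro %}
```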

@clausherther you are right. Redshift does indeed return a double.

However, since we are calling date_part in each of these cases with known parameters such as 'day' or 'dayofyear', it should be safe to cast to integer, right?

Implementing a Redshift specific date_part is a good idea. Can you point me to some Redshift specific switch used elsewhere?

Makes sense!
We don't actually have any Redshift specific macros, only BQ, Snowflake and PG where needed.
However, you could model this after the bigquery version of date_part and replace bigquery with redshift and then implement the Redshift specific version.
Mind you, you'll have to run the integration test suite on Redshift on your end. Unfortunately, I don't have access to a Redshift cluster, which is why we don't explicitly support Redshift in this package.
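
Concretely, a Redshift-specific version of, say, day_of_year could sit alongside the existing default, following the same adapter-prefix convention as the bigquery__ variants (a sketch only; the exact bodies are illustrative):

```sql
{# Roughly the existing default implementation #}
{%- macro default__day_of_year(date) -%}
    {{ dbt_date.date_part('dayofyear', date) }}
{%- endmacro %}

{# Hypothetical Redshift-specific implementation, casting to bigint #}
{%- macro redshift__day_of_year(date) -%}
    cast({{ dbt_date.date_part('dayofyear', date) }} as bigint)
{%- endmacro %}
```

day_of_week and day_of_month would follow the same pattern in their respective files.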

Cool. So I'll implement Redshift-specific macros in the day_of_week, day_of_month, and day_of_year files, and cast to bigint in each case. I'm somewhat averse to making this change in dbt_date.date_part since it may be used in other places. Do you agree?

Do I have to set dbt_utils_dispatch_list: [redshift] for this to work?

That sounds good. I don't think you have to do anything with the dispatch_list. The dispatch should just work when you run this against Redshift based on your target's adapter type.

Oh Ok. Thanks!

This is more involved than I originally expected :) Will fix it and run the tests over the weekend.

Ha, yes, it always is a bit more involved. Thanks!

Closed via #28