calogica/dbt-expectations

[Feature Request] Support regexp_instr for Trino

aezomz opened this issue · 10 comments

Is your feature request related to a problem? Please describe.
Lets support regex expectation for Trino on
https://github.com/calogica/dbt-expectations/blob/b21b13f020f5df2c90885d0715a2e713d1204700/macros/regex/regexp_instr.sql

Describe the solution you'd like
We can use this,
https://trino.io/docs/current/functions/regexp.html
regexp_position(string, pattern, start, occurrence) → integer

Describe alternatives you've considered
Using dbt custom test. but i think we should support regexp check for Trino

Additional context
Willing to contribute, please guide me if the file is the only change I need to make and tests to add.
Will look at the contributor read me...
https://github.com/calogica/dbt-expectations/blob/b21b13f020f5df2c90885d0715a2e713d1204700/macros/regex/regexp_instr.sql

Hi, we don't support Trino b/c we don't have integration testing for it. You can however add a shim to trino-utils to support that for that adapater.

@clausherther will i still be able to use dbt expectation regexp? or it have to come from trino-utils directly?

You'd add the trino implementation of regexp to the trino-utils package similar to this version of the deduplicate macro from dbt-utils https://github.com/starburstdata/dbt-trino-utils/blob/main/macros/dbt_utils/sql/deduplicate.sql or what they did here for log_natural for T-SQL https://github.com/dbt-msft/tsql-utils/blob/main/macros/dbt_expectations/math/log_natural.sql

In your dbt project, you then need to include the trino-utils package and also update your package path, similar to these examples, so that your project looks for matching macros for your adapter if available. You'd call this with dbt_expectations.regexp() though, so any macro calling regexp will just use the correct version for the adapter. Hope that makes sense.
https://github.com/dbt-msft/tsql-utils#installation-instructions

Thank you so much for the detailed explanation. I will check for the similar implementation.

wjhrdy commented

I hacked together a solution that works ontop of the trino_utils package since they haven't responded to the original MR

starburstdata/dbt-trino-utils#21 (comment)

wjhrdy commented

So I fixed this in dbt-trino-utils but they want to now add the fixes to this repo. starburstdata/dbt-trino-utils#35 (comment)

Is that something this repo would accept? It seems like it would be a bunch of work to get the testing apparatus in this repo for trino/starburst and maybe out of the scope of this repo. Please comment on the linked issue above about if you would accept this kind of addition and what you would need if you would accept it. Thanks.

Let me think about this. Someone just pushed a PR to dbt-date to add Trino support, and depending on how maintainable that is (i.e. particular if CI works smoothly) we can think about extending this to dbt-expectations.

Ah, looks like @damian3031 is also in the mix on the trino repo issue you linked 👍

wjhrdy commented

I realized dbt_expectations is not written in a way that everything can be overridden.

here is an example

starburstdata/dbt-trino-utils#21 (comment)