dataform-co/dataform

Operation: how to detect whether the current execution is a full-refresh

federicojasson opened this issue · 0 comments

Hello 👋

Is there any way to detect whether a full-refresh is being executed from an operation?

This doesn't work, because incremental() is not defined in the context of an operation:

config {
  type: "operation"
}

${when(incremental(), ..., ...)}

Some context about my use case and why I think this would be helpful:

  • I'm implementing an incremental table that reads from a federated table (Cloud SQL) using EXTERNAL_QUERY.
  • To avoid fetching all records from a large table in the external database, I want to use SELECT MAX(...) FROM ${self()} to get a timestamp value that acts as checkpoint.
  • Because the query passed to EXTERNAL_QUERY cannot be built from a variable (see this SO question), I have to use the workaround of running EXECUTE IMMEDIATE <my_external_query>.
  • Running EXECUTE IMMEDIATE ... requires an operation.
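  • An operation cannot use incremental(), so it cannot tell a full-refresh from an incremental run. During a full-refresh, calling SELECT MAX(...) FROM ${self()} is incorrect, because the checkpoint would come from data that is about to be replaced.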

My current workaround to avoid using an operation is to write the EXECUTE IMMEDIATE ... statement in a pre_operations block in an incremental definition file. However, reusing the results of that statement in another file becomes messy.
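For reference, a minimal sketch of that workaround. The connection ID, the events table, the updated_at column, and the staged_events temp table are all hypothetical names. It also assumes that pre_operations run in the same BigQuery script as the main query, so the declared variable and the temp table stay visible to it:

```sqlx
config {
  type: "incremental"
}

pre_operations {
  -- Checkpoint: latest timestamp already loaded, or a fixed floor on a full-refresh.
  DECLARE checkpoint TIMESTAMP DEFAULT (
    ${when(incremental(),
      `SELECT MAX(updated_at) FROM ${self()}`,
      `SELECT TIMESTAMP("1970-01-01")`)}
  );

  -- EXTERNAL_QUERY needs a literal query string, so the whole statement
  -- is built as a string and run with EXECUTE IMMEDIATE.
  EXECUTE IMMEDIATE FORMAT("""
    CREATE OR REPLACE TEMP TABLE staged_events AS
    SELECT * FROM EXTERNAL_QUERY(
      "my-project.us.my-connection",
      "SELECT * FROM events WHERE updated_at > '%s'"
    )
  """, FORMAT_TIMESTAMP("%F %T", checkpoint));
}

-- Main query: only the rows fetched after the checkpoint.
SELECT * FROM staged_events
```

Here incremental() is available because the file is an incremental definition, which is exactly what an operation lacks. The temp table only lives for the duration of that script, which is why sharing the result with another file is awkward.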

Thanks!