Health-Union/snowshu

Arbitrary SQL include qualifier

norton120 opened this issue · 0 comments

Why?

solving for specific problems often requires specific test data. This isn't exactly "sample" data, but it is data needed for testing - which is what SnowShu is for. So... we need a way to specify additional records (to those selected via sampling) that match a predicate.

Definition of Done:

maybe this could look like this?

-- say we need 4 bad order id's that constantly cause transform errors to _always_ be included in the sample...
  specified_relations: 
  - database: SNOWSHU_DEVELOPMENT
    schema: SOURCE_SYSTEM
    relation: ORDERS
    include_sql_predicate: "ORDER_ID IS IN (123, 444, 200, 101)" 

This should be explicitly the predicate definition (ie after the where clause)