GoogleCloudPlatform/cloud-data-quality

refactor code to allow cross-platform support

thinhha opened this issue · 1 comment

Right now the code only runs on BigQuery.

To add cross-platform support, at minimum we need to:

  • add a flag to the CLI to indicate the target platform
  • use this flag to perform a query dry run against the target platform to validate the SQL
  • check that the valid dbt configs for the target platform are provided
  • filter out rule bindings that do not target an entity in the corresponding platform
  • (ideally) refactor the query construction into a builder class, then pass that into a query engine class to execute, rather than passing raw SQL around. This builder class can customize the SQL to the target platform.
  • allow specifying where the DQ summary results are written (e.g. validate data in GCS but write validation results to BigQuery). This requires decoupling the step that generates the validation results from the step that writes them to the target sink.
  • ensure all SQL and Jinja are cross-platform compatible
  • add automated integration test suites that run the CLI against the different platforms
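
The builder/engine/sink split proposed above could look roughly like the sketch below. It is only an illustration of the shape, not the project's code: every name in it (`Platform`, `RuleBinding`, `NotNullQueryBuilder`, `QueryEngine`, `ResultSink`, `RecordingEngine`, `InMemorySink`, `run_validation`) is hypothetical, `POSTGRES` is just a stand-in for "some second platform", and the engine is an in-memory fake so the example runs without any warehouse client.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Protocol


class Platform(Enum):
    BIGQUERY = "bigquery"
    POSTGRES = "postgres"  # hypothetical second target, for illustration only


# Identifier quote character used by each platform's SQL dialect.
IDENTIFIER_QUOTE = {Platform.BIGQUERY: "`", Platform.POSTGRES: '"'}


@dataclass(frozen=True)
class RuleBinding:
    """Hypothetical, simplified rule binding: one NOT NULL check on one column."""
    binding_id: str
    platform: Platform  # platform of the entity this binding targets
    table: str
    column: str


class NotNullQueryBuilder:
    """Holds the query as structured data and renders SQL per platform."""

    def __init__(self, binding: RuleBinding) -> None:
        self.binding = binding

    def build(self, platform: Platform) -> str:
        q = IDENTIFIER_QUOTE[platform]
        return (
            f"SELECT COUNT(*) AS failed_count "
            f"FROM {q}{self.binding.table}{q} "
            f"WHERE {q}{self.binding.column}{q} IS NULL"
        )


class QueryEngine(Protocol):
    platform: Platform

    def dry_run(self, sql: str) -> None: ...
    def execute(self, sql: str) -> int: ...


class ResultSink(Protocol):
    def write(self, binding_id: str, failed_count: int) -> None: ...


@dataclass
class RecordingEngine:
    """Fake engine: records SQL instead of sending it to a warehouse."""
    platform: Platform
    executed: list = field(default_factory=list)

    def dry_run(self, sql: str) -> None:
        # Stand-in for a real platform-side dry run.
        if not sql.lstrip().upper().startswith("SELECT"):
            raise ValueError(f"unexpected statement: {sql}")

    def execute(self, sql: str) -> int:
        self.executed.append(sql)
        return 0  # pretend no rows failed the check


@dataclass
class InMemorySink:
    """Fake sink: the real one could be on a different platform than the engine."""
    rows: list = field(default_factory=list)

    def write(self, binding_id: str, failed_count: int) -> None:
        self.rows.append((binding_id, failed_count))


def run_validation(bindings, engine: QueryEngine, sink: ResultSink) -> None:
    for binding in bindings:
        # Skip bindings whose entity lives on a different platform.
        if binding.platform is not engine.platform:
            continue
        sql = NotNullQueryBuilder(binding).build(engine.platform)
        engine.dry_run(sql)  # validate before executing
        sink.write(binding.binding_id, engine.execute(sql))


bindings = [
    RuleBinding("b1", Platform.BIGQUERY, "orders", "id"),
    RuleBinding("b2", Platform.POSTGRES, "users", "email"),
]
engine = RecordingEngine(Platform.POSTGRES)
sink = InMemorySink()
run_validation(bindings, engine, sink)
```

Because `run_validation` only sees the `QueryEngine` and `ResultSink` interfaces, the engine and the sink can be swapped independently, which is what the "validate in one place, write results to another" item needs.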

This is complete.