datasets-br/sql-unifier

Asserts as validation and constraint strategies

ppKrauss opened this issue · 1 comment

As in the Schematron validation philosophy... The goal is to validate any kind of constraint, but mainly to validate frictionlessdata.io/table-schema/constraints

  • Some SQL constraints can be obtained with a MATERIALIZED VIEW, e.g. via CREATE UNIQUE INDEX... but the full set of usual SQL constraints works only on real tables. ... So, do we need tables? Datasets are "read only" in general. The most important constraint is FOREIGN KEY/REFERENCES. Advantage of using tables: we don't reinvent the wheel.

  • Standard asserts on JTD (JSON Type Definition). Example: a tab-aoa JSON needs to simulate CHECK(jsonb_array_length(j)>0) on insert...

  • Non-SQL but standard: all implemented by SQL or PL/pgSQL functions. See "not-null", "primary-key", etc. in frictionlessdata/constraints... These need standard asserts.

  • Non-SQL and non-standard: ... free-form constraints. See the Schematron-validation-like use.
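A minimal PL/pgSQL sketch of the second and third points above, assuming a hypothetical dataset.big table with a jsonb column j — the function and trigger names are illustrative, not from the project:

```sql
-- Hypothetical helper simulating CHECK(jsonb_array_length(j)>0)
-- for tab-aoa content, written as a reusable standard assert.
CREATE OR REPLACE FUNCTION dataset.assert_aoa(j jsonb) RETURNS jsonb AS $$
BEGIN
  IF j IS NULL OR jsonb_typeof(j) <> 'array' OR jsonb_array_length(j) = 0 THEN
    RAISE EXCEPTION 'tab-aoa assert failed: expected non-empty JSON array, got %', j;
  END IF;
  RETURN j;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- Applied on insert, since datasets are read-only after load:
CREATE OR REPLACE FUNCTION dataset.big_check() RETURNS trigger AS $$
BEGIN
  PERFORM dataset.assert_aoa(NEW.j);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER big_check BEFORE INSERT OR UPDATE ON dataset.big
  FOR EACH ROW EXECUTE PROCEDURE dataset.big_check();
```

The same pattern extends to the other frictionlessdata constraints ("not-null", "primary-key", ...): one assert function per constraint, reused by triggers or by maintenance queries.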


3-Layer architecture:

  1. Git with datapackage, CSVs, etc. — the main reference and data origin; the single source of truth in the community context.
  2. The datasets in dataset.big — the single source of truth in the database context.
  3. Views, materialized views and tables, for performance (fast queries) or for constraint control in maintenance operations (validating new datasets or consolidating the ecosystem).
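A sketch of layer 3 over layer 2, assuming the same hypothetical dataset.big store — the view name, columns and source filter are illustrative. The UNIQUE INDEX stands in for a PRIMARY KEY constraint, which a materialized view cannot declare directly:

```sql
-- Layer-3 materialized view projecting one dataset out of the layer-2 store.
CREATE MATERIALIZED VIEW dataset.mv_country AS
  SELECT j->>'code' AS code, j->>'name' AS name
  FROM dataset.big
  WHERE source = 'country-codes';

-- Acts as a uniqueness assert: REFRESH MATERIALIZED VIEW will fail
-- if a newly validated dataset introduces duplicate codes.
CREATE UNIQUE INDEX mv_country_pk ON dataset.mv_country (code);
```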

For complex asserts, the suggestion is the TAP standard ("Test Anything Protocol"). See http://pgtap.org/
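A pgTAP sketch of what such a complex assert could look like, assuming the pgTAP extension is installed and reusing the hypothetical dataset.mv_country view from above:

```sql
-- pgTAP test script: run with pg_prove or psql.
BEGIN;
SELECT plan(2);

SELECT has_materialized_view('dataset', 'mv_country', 'layer-3 view exists');
SELECT is(
  (SELECT count(*) FROM dataset.mv_country WHERE code IS NULL),
  0::bigint,
  'not-null constraint holds for the code column'
);

SELECT * FROM finish();
ROLLBACK;
```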

For simple asserts, see the ASSERT statement of pg9.5+: https://postgres.cz/wiki/PL/pgSQL_(en)#ASSERT_statement
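For example, a simple assert can live in a plain DO block (PostgreSQL 9.5+); the query over the hypothetical dataset.big is illustrative:

```sql
DO $$
BEGIN
  -- ASSERT raises an ASSERT_FAILURE exception when the condition is false.
  ASSERT (SELECT count(*) FROM dataset.big) > 0,
         'dataset.big must not be empty';
END;
$$;
```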