Schema validator
pwalsh opened this issue · 6 comments
pwalsh commented
The SchemaValidator checks that data conforms to a JSON Table Schema.
- Implement shared validator API
- Create a better reference spec for JTS itself (see: https://github.com/dataprotocols/schemas)
- Implement standalone run method
- Check headers are valid according to schema
- Check data is valid according to schema
- This can be very deep, so discussed with @rgrp to minimally start with date and number validation, and build out from there in iterations. Will create separate issues when this issue closes.
- Write tests as stand alone (via self.run)
- Write tests as part of pipeline (via PipelineValidator.run)
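As a rough illustration of the "check headers are valid according to schema" item, here is a minimal sketch. It assumes the schema is a dict with a "fields" list whose entries carry a "name", as in JSON Table Schema; the function name check_headers and the error strings are hypothetical and not taken from the actual implementation.

```python
def check_headers(headers, schema):
    """Compare data headers with the field names declared in the schema.

    Returns a list of error messages; an empty list means the headers match.
    Hypothetical helper, not the SchemaValidator API.
    """
    expected = [field["name"] for field in schema["fields"]]
    errors = []
    # Fields declared in the schema but absent from the data.
    for name in expected:
        if name not in headers:
            errors.append("missing header: {0}".format(name))
    # Columns present in the data but not declared in the schema.
    for name in headers:
        if name not in expected:
            errors.append("unexpected header: {0}".format(name))
    return errors


# Example schema, made up for illustration.
schema = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "created", "type": "date"},
    ]
}

print(check_headers(["id", "created"], schema))
print(check_headers(["id", "name"], schema))
```

This sketch ignores column order and duplicate headers; a real validator would probably need to decide how strict to be about both.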
Rabbit hole
Stuff that is beyond scope of this first pass, but that defines the larger scope of where we'd like to get.
- Generate schema from the data, if we do not have a schema #15
- foreignKeys: #17
- constraints.minLength, constraints.maxLength, constraints.minimum, constraints.maximum: needs discussion, see frictionlessdata/specs#161
- Some issues around type and format. Would like to see this resolved before implementing deeper support of the spec: frictionlessdata/specs#159
pwalsh commented
This implementation supports validation of the following types:
- string
- integer
- number
- object
- array
- date, time, datetime
- boolean
- any :)
It doesn't really deal with formats, except in the case of the date/time types.
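To make the idea of per-type validation concrete, here is a minimal sketch covering a few of the types listed above. It is not the code from this implementation: the CASTERS table, the is_valid function, the accepted boolean literals and the ISO date format (YYYY-MM-DD) are all assumptions made for illustration.

```python
import datetime


def cast_date(value):
    # Assumes ISO 8601 dates (YYYY-MM-DD); other formats are out of scope here.
    return datetime.datetime.strptime(value, "%Y-%m-%d").date()


def cast_boolean(value):
    # Assumes only the literals "true" and "false" (case-insensitive).
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    raise ValueError("not a boolean: {0!r}".format(value))


# Hypothetical mapping from schema type name to a casting function.
CASTERS = {
    "string": str,
    "integer": int,
    "number": float,
    "date": cast_date,
    "boolean": cast_boolean,
    "any": lambda value: value,
}


def is_valid(value, type_name):
    """Return True if the raw string value can be cast to the given type."""
    caster = CASTERS[type_name]
    try:
        caster(value)
    except (TypeError, ValueError):
        return False
    return True


print(is_valid("42", "integer"))
print(is_valid("4.2", "integer"))
print(is_valid("2015-02-30", "date"))
```

Validation by attempted casting keeps each type's rules in one small function, which fits the plan of starting with date and number and adding types in later iterations.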
Other stuff that is not in our scope now is all recorded under the "Rabbit hole" heading of the main issue description above (which links out to specific issues).
Related: I've updated my type casting to have an almost identical API to messytables, but I am still holding off on depending on messytables directly because I'd like to keep py3 support (see okfn/messytables#117).