recap-build/recap

Add spec test for built-in logical types


https://github.com/recap-build/recap/pull/346/files#r1282108522

Currently, the JSON Schema metaschema for recap does not know about the built-in logical types.

A PR to resolve this issue should:

  • add a new def for each of the logical types
  • add valid test recap specs using the logical types

Working on this.

So, I bumped into a little snag. It relates to what I was saying in #350 about runtime vs. compile-time validity.

This test does three things:

@pytest.mark.parametrize("schema_path", invalid_schema_paths)
def test_invalid_schemas(schema_path: str, meta_schema: dict):
    with open(schema_path, "r") as file:
        schema = json.load(file)
        with pytest.raises(ValidationError):
            validate(schema, meta_schema)
        with pytest.raises((ValueError, TypeError)):
            registry = RecapTypeRegistry()
            recap_type = from_dict(clean_dict(schema), registry)
            check_aliases(recap_type, registry)

It asserts that (1) validate raises a ValidationError AND that either (2) from_dict() or (3) check_aliases raises a ValueError or TypeError.

Upon updating the JSON schema metaschema, I found a few tests were failing, such as:

[
  {
    "doc": "doc",
    "type": "invalidtype",
    "bytes": 255
  }
]

This spec is listed under "invalid", so an exception is expected. Yet I get:

    @pytest.mark.parametrize("schema_path", invalid_schema_paths)
    def test_invalid_schemas(schema_path: str, meta_schema: dict):
        with open(schema_path, "r") as file:
            schema = json.load(file)
>           with pytest.raises(ValidationError):
E           Failed: DID NOT RAISE <class 'jsonschema.exceptions.ValidationError'>

This happens because the schema is valid according to the JSON Schema metaschema: "invalidtype" is treated as an alias reference, and we can't validate aliases in JSON Schema.

It was passing before because "additionalProperties": false was set for all types. That caused the spec to be checked against RecapAliasReference, and bytes is not a member of RecapAliasReference, so validation failed. I removed "additionalProperties": false because attribute overrides are allowed for aliases, and non-defined types are ignored (they're stored in extra_attrs in Python's RecapType).

So the question is what to do here. Do we want failing the JSON Schema metaschema to be equivalent to failing from_dict? Or do we want failing either the metaschema or check_aliases to be equivalent to failing from_dict? Or should we wrap metaschema validation inside a validate method that also validates aliases?

:notsureif:

We could also differentiate in the tests between invalid because of dangling alias ref vs. invalid because of CST structure...
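As a rough sketch of what that split could look like: everything below is hypothetical (the stage labels, the stand-in checks, and the type list are not recap's API), and the spec is simplified to a single object rather than a list.

```python
# Hypothetical sketch: report WHICH stage rejects an invalid spec, so the
# test fixtures can be tagged by failure cause instead of lumped together.
STRUCTURAL = "structure"    # rejected by the (stand-in) metaschema check
DANGLING_ALIAS = "alias"    # well-formed, but refers to an undefined alias

KNOWN_TYPES = {"int", "bytes", "string"}  # stand-in, not recap's real type list


def fake_validate(schema):
    """Stand-in for metaschema validation: only checks the overall shape."""
    if not isinstance(schema, dict) or not isinstance(schema.get("type"), str):
        raise ValueError("structurally invalid")


def fake_check_aliases(schema):
    """Stand-in for alias resolution: the type name must be known."""
    if schema["type"] not in KNOWN_TYPES:
        raise TypeError(f"dangling alias: {schema['type']}")


def failing_stage(schema):
    """Return the label of the first stage that rejects the spec, or None."""
    try:
        fake_validate(schema)
    except ValueError:
        return STRUCTURAL
    try:
        fake_check_aliases(schema)
    except TypeError:
        return DANGLING_ALIAS
    return None


# Each fixture carries the failure cause it is expected to trigger.
cases = [
    ({"type": 42}, STRUCTURAL),
    ({"doc": "doc", "type": "invalidtype", "bytes": 255}, DANGLING_ALIAS),
]
for schema, expected in cases:
    assert failing_stage(schema) == expected
```

The upside is that a spec moving from one category to the other (like the "invalidtype" one did) shows up as a failure with a clear cause.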

Or we could make the test like this:

@pytest.mark.parametrize("schema_path", invalid_schema_paths)
def test_invalid_schemas(schema_path: str, meta_schema: dict):
    with open(schema_path, "r") as file:
        schema = json.load(file)
        with pytest.raises((ValidationError, ValueError, TypeError)):
            validate(schema, meta_schema)
            registry = RecapTypeRegistry()
            recap_type = from_dict(clean_dict(schema), registry)
            check_aliases(recap_type, registry)

In this case, the test passes as long as validate, from_dict, or check_aliases fails.

I think I got it figured out. This is what I did:

@pytest.mark.parametrize("schema_path", invalid_schema_paths)
def test_invalid_schemas(schema_path: str, meta_schema: dict):
    with open(schema_path, "r") as file:
        schema = json.load(file)

        try:
            validate(schema, meta_schema)
            is_valid_metaschema = True
        except ValidationError:
            # Got a validation error, so the schema is invalid per the metaschema.
            # Nothing more to check, since we're expecting an invalid schema.
            is_valid_metaschema = False

        # metaschema passed, so aliases should fail
        if is_valid_metaschema:
            registry = RecapTypeRegistry()
            # from_dict should parse since `validate` passed
            recap_type = from_dict(clean_dict(schema), registry)
            # aliases should fail, since we're expecting an invalid schema
            # and the metaschema is valid.
            with pytest.raises((ValueError, TypeError)):
                check_aliases(recap_type, registry)

Basically, check whether validate passes. If it does, then from_dict should also succeed and check_aliases should fail. If validate fails, there is nothing else to do: it is an invalid schema.
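To make the decision logic explicit, here is a minimal stdlib-only model of that flow. The function name and the stand-in callables are made up for illustration, and ValueError stands in for jsonschema's ValidationError; it is not the actual test code.

```python
def invalid_fixture_is_rejected(schema, validate, parse, check_aliases):
    """Return True if the fixture is rejected the way the test expects."""
    try:
        validate(schema)
    except ValueError:  # stand-in for jsonschema's ValidationError
        # Rejected by the metaschema; nothing more to check.
        return True

    # The metaschema passed, so parsing must succeed. There is deliberately
    # no try/except here: an error from parse propagates to the caller.
    parsed = parse(schema)

    try:
        check_aliases(parsed)
    except (ValueError, TypeError):
        # Well-formed spec with a bad alias: still counts as invalid.
        return True

    # Every stage passed, so the fixture is not actually invalid.
    return False


# Hypothetical stand-ins for the three stages.
def reject(_schema):
    raise ValueError("rejected by metaschema")


def accept(schema):
    return schema


def dangling(_recap_type):
    raise TypeError("dangling alias")


print(invalid_fixture_is_rejected({}, reject, accept, accept))    # True
print(invalid_fixture_is_rejected({}, accept, accept, dangling))  # True
print(invalid_fixture_is_rejected({}, accept, accept, accept))    # False
```

The difference from the single pytest.raises version is the middle step: once the metaschema accepts a spec, a from_dict failure is treated as a bug rather than as the expected rejection.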

PR up at #360