asdf-format/asdf-standard

Overly permissive ndarray schema leads to improper time validation

braingram opened this issue · 1 comments

The ndarray schema accepts any object that does not have a source key (and some that do). For example:

import asdf
import asdf.tests.helpers  # import the submodule explicitly so asdf.tests.helpers resolves


bad_string = """
!core/ndarray-1.0.0
 foo: 1
 bar: 2
"""

buff = asdf.tests.helpers.yaml_to_asdf(f"example: {bad_string.strip()}")

# make sure validate_on_read is enabled
c = asdf.get_config()
c.validate_on_read = True

with asdf.AsdfFile() as af:
    try:
        af._open_impl(af, buff, mode='rw')
    except asdf.ValidationError as e:
        print("File appropriately failed to validate")

This does not produce a validation error. Instead, an exception is raised during conversion to an array:

  File "/Users/bgraham/projects/221108_stdatamodel_bugs/asdf/asdf/tags/core/ndarray.py", line 395, in from_tree
    byteorder = node["byteorder"]
KeyError: 'byteorder'

This schema is referenced in the time schema as part of an anyOf composition:
https://github.com/asdf-format/asdf-standard/blob/master/resources/schemas/stsci.edu/asdf/time/time-1.1.0.yaml#L127
Because the ndarray branch accepts almost any object, the object schema that time defines after it is effectively never used, and invalid YAML is produced for time objects that violate the time schema. For example, serializing

astropy.time.Time(730120.0003703703, format="plot_date")

produces

t: !time/time-1.1.0 {base_format: plot_date, value: '2000-01-01T00:00:32.000'}

which contains a base_format that is not listed in the time schema (this example was taken from the time tests in asdf-astropy, some of which pass because of this bug).
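
To illustrate the masking effect, here is a minimal sketch of how anyOf behaves when one branch is overly permissive. It uses a toy validator written for this comment, not asdf's validator or the jsonschema library, and both schemas are simplified, hypothetical stand-ins (the enum values in particular are an assumed subset, not the real list in the time schema).

```python
def is_valid(instance, schema):
    """Toy validator supporting only anyOf, type, enum, required, properties."""
    if "anyOf" in schema:
        # anyOf passes as soon as a single branch accepts the instance
        if not any(is_valid(instance, sub) for sub in schema["anyOf"]):
            return False
    if schema.get("type") == "object" and not isinstance(instance, dict):
        return False
    if "enum" in schema and instance not in schema["enum"]:
        return False
    if isinstance(instance, dict):
        if any(key not in instance for key in schema.get("required", [])):
            return False
        for key, sub in schema.get("properties", {}).items():
            # properties only constrains keys that are actually present
            if key in instance and not is_valid(instance[key], sub):
                return False
    return True


# Hypothetical stand-in for the permissive branch: no required keys,
# and unknown keys are not rejected.
permissive_ndarray = {
    "type": "object",
    "properties": {"source": {}, "byteorder": {"enum": ["big", "little"]}},
}

# Hypothetical stand-in for the stricter time object branch.
strict_time = {
    "type": "object",
    "properties": {"value": {}, "base_format": {"enum": ["iso", "jd", "mjd"]}},
    "required": ["value"],
}

combined = {"anyOf": [permissive_ndarray, strict_time]}

node = {"base_format": "plot_date", "value": "2000-01-01T00:00:32.000"}

print(is_valid(node, strict_time))         # False
print(is_valid(node, permissive_ndarray))  # True
print(is_valid(node, combined))            # True
```

The strict branch rejects the node on its own, but the combined schema still accepts it because the permissive branch matches first.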

The "example" of a "bug" above should be permissible in time-1.2.0; this is simply due to the bug fixed by #349. The larger issue is the original generated example.

Clearly, the example asdf data should fail the ndarray-1.0.0 schema; however, it does not. This is the true bug.
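
As a rough sketch of what a stricter check might look like, the function below rejects the foo/bar example. The rule it encodes is an assumption made for illustration (an object-form ndarray carries either inline data, or a source together with shape, datatype and byteorder); it is not the fix adopted in asdf-standard, and looks_like_ndarray is a hypothetical name.

```python
def looks_like_ndarray(node):
    """Hypothetical structural check, assumed for illustration only."""
    if isinstance(node, list):
        # inline array form
        return True
    if not isinstance(node, dict):
        return False
    if "data" in node:
        # object form with inline data
        return True
    # object form pointing at a block: assume these keys must all be present
    needed_with_source = {"source", "shape", "datatype", "byteorder"}
    return needed_with_source.issubset(node)


bad_node = {"foo": 1, "bar": 2}
good_node = {"source": 0, "shape": [2], "datatype": "int64", "byteorder": "little"}

print(looks_like_ndarray(bad_node))   # False
print(looks_like_ndarray(good_node))  # True
```

With a rule like this, the example would fail at validation time rather than raising a KeyError in from_tree.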