asdf-format/asdf

asdf.commands.diff will report that contents differ for any arrays including NaNs

schlafly opened this issue · 1 comment

I think asdf.commands.diff should not report a difference when comparing two arrays that are identical, including NaNs in the same positions. This comparison
https://github.com/asdf-format/asdf/blob/main/asdf/commands/diff.py#L268-L269
needs the equal_nan=True argument for that to happen. Without it, any array containing NaNs will always be reported as different here, because NaN never compares equal to itself.
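
For anyone who wants to see the effect in isolation, here is a minimal sketch using NumPy directly. It assumes the linked comparison behaves like `np.array_equal`, which I have not verified against the current source; the array values are made up for illustration.

```python
import numpy as np

# Two arrays with identical contents, including a NaN in the same position.
a = np.array([1.0, np.nan, 3.0])
b = a.copy()

# Default comparison: NaN != NaN, so the arrays are reported as unequal.
strict = np.array_equal(a, b)

# With equal_nan=True, NaNs in matching positions are treated as equal.
lenient = np.array_equal(a, b, equal_nan=True)

# A NaN in a different position is still a real difference.
c = np.array([np.nan, 1.0, 3.0])
moved = np.array_equal(a, c, equal_nan=True)

print(strict, lenient, moved)
```

Note that `equal_nan` was added to `np.array_equal` in NumPy 1.19, so older installations would need a different approach.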

Only vaguely related to this issue, but I was also hoping for more verbose output from asdf.commands.diff. In the particular file I'm looking at:

>>> asdf.commands.diff([
    'test_ramp_fitting_step0/truth/r0000101001001001001_01101_0001_WFI01_rampfit.asdf',
    'test_ramp_fitting_step0/r0000101001001001001_01101_0001_WFI01_rampfit.asdf'],
    minimal=False, ignore=['roman.cal_logs', 'asdf_library', 'history'])
        ndarrays differ by contents

I get only "ndarrays differ by contents". I would have appreciated being told something more like "tree['roman']['data'] ndarrays differ by contents", so that I knew which array differed rather than just that some array somewhere in the file did.
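
To make the request concrete, here is a rough sketch of the kind of path-aware reporting I have in mind. This is not how asdf.commands.diff is implemented; `diff_trees` and `format_path` are hypothetical names, the trees are plain dicts rather than real ASDF trees, and `ignore` is simplified to an exact dotted-path match.

```python
import numpy as np


def format_path(keys):
    """Render a tuple of keys as tree['a']['b'] (hypothetical helper)."""
    return "tree" + "".join(f"[{k!r}]" for k in keys)


def diff_trees(a, b, keys=(), ignore=()):
    """Yield one message per difference, prefixed with where it was found."""
    path = format_path(keys)
    if isinstance(a, dict) and isinstance(b, dict):
        for k in sorted(set(a) | set(b), key=str):
            sub = keys + (k,)
            # Simplified ignore rule: skip exact dotted-path matches.
            if ".".join(map(str, sub)) in ignore:
                continue
            if k not in a:
                yield f"{format_path(sub)} only in second tree"
            elif k not in b:
                yield f"{format_path(sub)} only in first tree"
            else:
                yield from diff_trees(a[k], b[k], sub, ignore)
    elif isinstance(a, np.ndarray) or isinstance(b, np.ndarray):
        if not (isinstance(a, np.ndarray) and isinstance(b, np.ndarray)):
            yield f"{path} differ by type"
        # Numeric arrays only in this sketch; NaNs in matching positions are equal.
        elif not np.array_equal(a, b, equal_nan=True):
            yield f"{path} ndarrays differ by contents"
    elif a != b:
        yield f"{path} values differ"


# Made-up trees standing in for the two files being compared.
truth = {"roman": {"data": np.array([1.0, np.nan]), "meta": {"n": 1}}, "history": [1]}
result = {"roman": {"data": np.array([2.0, np.nan]), "meta": {"n": 1}}, "history": [2]}

messages = list(diff_trees(truth, result, ignore=("history",)))
for m in messages:
    print(m)
```

The point is only that carrying the key path down through the recursion makes it cheap to prefix every message with the location of the difference.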

Thanks for bringing this up. I just had a brief discussion with @nden about this and other asdf diff changes. I'm going to try converting this to a "discussion" (in part to see how that works) but also to solicit input from @perrygreenfield and @eslavich on what verbose output might look like.

It sounds like the changes so far are (and I might update this portion of this comment as the discussion continues):

  • add an option to specify equal_nan=True for array comparisons
  • create a more verbose output for array differences