mr-mixas/Nested-Diff.py

Suggest an ignore_nan feature for nested-diff

Closed this issue · 3 comments

Hello Michael,

I really like nested-diff; it is nice and clean, especially the text formatter feature.

However, I found that the diff function currently does not support ignoring NaNs when comparing items. Could this feature be added in the next release?

Here is my pseudocode suggestion:
if ignore_nan and isinstance(x, float) and isinstance(y, float) and str(x) == str(y) == 'nan':
    pass  # both values are NaN: treat them as equal and do not store them in "D"
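
The check proposed above can be sketched as a small standalone helper. This is only an illustration of the idea, not nested-diff code: the name nan_aware_equal and its ignore_nan parameter are hypothetical, and math.isnan is used in place of the string comparison because it tests for NaN directly.

```python
import math


def nan_aware_equal(x, y, ignore_nan=True):
    """Hypothetical helper: compare two values, optionally treating NaN == NaN."""
    if ignore_nan and isinstance(x, float) and isinstance(y, float):
        # NaN never compares equal to itself, so test for it explicitly.
        if math.isnan(x) and math.isnan(y):
            return True
    return x == y


nan = float('nan')
print(nan == nan)                                   # False
print(nan_aware_equal(nan, nan))                    # True
print(nan_aware_equal(nan, nan, ignore_nan=False))  # False
print(nan_aware_equal(1.0, 1.0))                    # True
```

The isinstance checks on both operands keep a string such as 'nan' from being mistaken for a float NaN.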

Thank you.

Hi,

Sorry for the long silence, and thanks for the kind words!
It's nice to hear nested-diff is useful =)

The diff function is here mostly for backward compatibility and will most likely be deprecated in the future.
There is no way to provide options for every possible data type. That's why handlers were introduced =)

You can get the desired behavior (NaN equal to NaN) by using the Differ object directly with a custom handler:

>>> from math import isnan
>>> from nested_diff import Differ, handlers
>>> 
>>> 
>>> class CustomFloatHandler(handlers.FloatHandler):
...     def diff(self, differ, a, b):
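...         if isnan(a) and isnan(b):
...             # Both values are NaN: report them as equal.
...             if differ.op_u:
...                 # Unchanged ('U') output is enabled, so include the value.
...                 return True, {'U': a}
...             return True, {}
...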
...         return super().diff(differ, a, b)
... 
>>> 
>>> differ = Differ()
>>> differ.set_handler(CustomFloatHandler())
>>>
>>> differ.diff(float('nan'), float('nan'))
(True, {'U': nan})
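
To show why per-type handlers scale better than one option per data type, here is a toy sketch of the dispatch idea. It is not nested-diff's implementation: ToyFloatHandler, ToyDiffer, and their methods are hypothetical names used only for illustration, and the sketch reports equality only instead of building a diff.

```python
import math


class ToyFloatHandler:
    """Toy handler (not nested-diff's class): decides equality for floats."""

    handled_type = float

    def __init__(self, nans_equal=False):
        self.nans_equal = nans_equal

    def equal(self, a, b):
        # Optionally treat two NaNs as equal; otherwise use plain comparison.
        if self.nans_equal and math.isnan(a) and math.isnan(b):
            return True
        return a == b


class ToyDiffer:
    """Toy differ: picks a handler by the type of the compared values."""

    def __init__(self):
        self.handlers = {}

    def set_handler(self, handler):
        # Register (or replace) the handler for its declared type.
        self.handlers[handler.handled_type] = handler

    def equal(self, a, b):
        handler = self.handlers.get(type(a))
        if handler is not None and type(a) is type(b):
            return handler.equal(a, b)
        # No handler registered for this type: fall back to ==.
        return a == b


nan = float('nan')
differ = ToyDiffer()
print(differ.equal(nan, nan))  # False

differ.set_handler(ToyFloatHandler(nans_equal=True))
print(differ.equal(nan, nan))  # True
print(differ.equal(1.0, 2.0))  # False
```

New type-specific behavior is added by registering another handler, so the differ itself needs no extra options.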

Hi,

I thought again, and again, and... you are right :)
There should be a built-in option for ignoring the difference between NaN instances.

I added a nans_equal option to the float handler and an extra_handlers option to diff.
Since v1.2, ignoring differences between NaNs with the diff function is just:

>>> from nested_diff import diff, handlers
>>> 
>>> diff(float('nan'), float('nan'), extra_handlers=[handlers.FloatHandler(nans_equal=True)])
{'U': nan}
>>> 

Hope it helps =)