datafold/data-diff

Hashdiff table behaviour changed

saurasingh opened this issue · 2 comments

Describe the bug
We recently updated from data-diff version 0.8.5rc1 to 0.9.12 and noticed a difference in how hashdiff_table diffs over bisections. Version 0.8.5rc1 first went through all top-level bisections to find diffs, and only at the end went through the segments where a diff was identified. Attached log file: 0.8.5rc1.log

With version 0.9.12, however, as soon as the top-level bisection starts, it goes into every top-level segment to look for diffs, even if that segment has no diff, and this adds run time to the process. Attached log file: 0.9.12.log

Both these logs are against the same table.
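
To make the difference concrete, here is a toy sketch of the two traversal patterns as we read them from the logs. It is not data-diff's code: `checksum`, `split`, `bisect_diff` and the `descend_into_equal_top_level` flag are hypothetical names, and the in-memory lists stand in for the two tables.

```python
import hashlib
from collections import deque


def checksum(rows, lo, hi):
    """Checksum of all (key, value) rows with lo <= key < hi."""
    h = hashlib.md5()
    for key, value in rows:
        if lo <= key < hi:
            h.update(f"{key}:{value};".encode())
    return h.hexdigest()


def split(lo, hi, factor):
    """Split [lo, hi) into at most `factor` contiguous segments."""
    step = max(1, -(-(hi - lo) // factor))  # ceiling division
    return [(s, min(s + step, hi)) for s in range(lo, hi, step)]


def bisect_diff(a, b, lo, hi, factor=4, threshold=2,
                descend_into_equal_top_level=False):
    """Toy checksum bisection (hypothetical, not data-diff's implementation).

    Segments are processed level by level. By default a segment is split
    further only when its checksums differ. With the flag set, top-level
    segments are split even when their checksums match, which models the
    extra work we think we see in the 0.9.12 log.
    """
    visited, diffs = [], []
    queue = deque((seg, 0) for seg in split(lo, hi, factor))
    while queue:
        (seg_lo, seg_hi), depth = queue.popleft()
        visited.append((seg_lo, seg_hi))
        equal = checksum(a, seg_lo, seg_hi) == checksum(b, seg_lo, seg_hi)
        force = descend_into_equal_top_level and depth == 0
        if equal and not force:
            continue  # nothing to find in this segment
        if seg_hi - seg_lo <= threshold:
            if not equal:
                diffs.append((seg_lo, seg_hi))  # small enough to report
            continue
        queue.extend((seg, depth + 1) for seg in split(seg_lo, seg_hi, factor))
    return visited, diffs


# Two 16-row "tables" that differ only at key 5.
table_a = [(k, k * 10) for k in range(16)]
table_b = [(k, k * 10 if k != 5 else -1) for k in range(16)]

old_visited, old_diffs = bisect_diff(table_a, table_b, 0, 16)
new_visited, new_diffs = bisect_diff(table_a, table_b, 0, 16,
                                     descend_into_equal_top_level=True)
print(len(old_visited), len(new_visited))  # 8 vs 20 segments checksummed
```

Both calls report the same differing segment, but the second one checksums more segments, which is the kind of overhead we are seeing on a real table.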

This issue has been marked as stale because it has been open for 60 days with no activity. If you would like the issue to remain open, please comment on the issue and it will be added to the triage queue. Otherwise, it will be closed in 7 days.

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment and it will be reopened for triage.