Checking changes/`git diff` in test dataset expected outputs is currently tedious and prone to error
Right now, any time I regenerate the expected outputs for the test datasets, I have to review the git diff
to check for unexpected changes. I'm currently reviewing the csv files in VS Code using its built-in comparison tools, but even then it's difficult to see what has changed because the diff is plain text, i.e. comma-delimited rows like "val1,val2,,,,val3".
I need to figure out a better way to view changes in csv files, maybe a VS Code extension. The lumberjack package
may also be useful (it logs changes in data).
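To make the problem concrete, here is a rough sketch of the kind of cell-level view that would be easier to review than a line diff. It uses only Python's standard library and is not how daff or lumberjack work. The function name `cell_changes` and the sample data are made up for illustration, and it assumes both files share a header and keep their rows in the same order.

```python
import csv
import io


def cell_changes(old_text, new_text):
    """Return (row, column, old, new) for every data cell that differs.

    Assumes both CSVs share a header (the header row itself is not
    compared) and keep rows in the same order. A row present in only
    one file is reported as a single "<row>" entry.
    """
    old_rows = list(csv.reader(io.StringIO(old_text)))
    new_rows = list(csv.reader(io.StringIO(new_text)))
    header = old_rows[0] if old_rows else []
    changes = []
    for i in range(1, max(len(old_rows), len(new_rows))):
        old = old_rows[i] if i < len(old_rows) else None
        new = new_rows[i] if i < len(new_rows) else None
        if old is None or new is None:
            changes.append((i, "<row>", old, new))
            continue
        for j in range(max(len(old), len(new))):
            a = old[j] if j < len(old) else ""
            b = new[j] if j < len(new) else ""
            if a != b:
                # Fall back to a positional name if the header is short.
                name = header[j] if j < len(header) else f"col{j}"
                changes.append((i, name, a, b))
    return changes


# Hypothetical before/after contents of one expected-output file.
old = "id,a,b,c\n1,val1,,val3\n2,x,y,z\n"
new = "id,a,b,c\n1,val1,,val4\n2,x,,z\n"
for row, col, before, after in cell_changes(old, new):
    print(f"row {row}, column {col}: {before!r} -> {after!r}")
```

Reporting changes by row and column name avoids counting commas by eye, which is the main pain point with the raw text diff. A real tool also has to handle reordered or inserted rows, which this sketch does not attempt.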
This looks promising:
https://github.com/paulfitz/daff
Follow the instructions to install it:
https://github.com/paulfitz/daff
Run this once in Git Bash to configure git to use daff for csv files:
daff git csv
After that, you can run git diff and save the output to a csv; the diff will be in daff
format:
git diff SHA1..SHA2 --color > diff.csv
Then run:
daff render --output output.html diff.csv
to render it as HTML (you can edit the CSS for readability).