Check for data integrity
BennyThadikaran opened this issue · 0 comments
BennyThadikaran commented
A user contacted me on Friday with an error while running EOD2 sync. The error seemed to point to duplicate entries in one of the CSV files.
I couldn't find the source of that error, but I decided to check my own data for any such issues.
- No duplicate entries were found in the `daily` or `delivery` folders.
- There was, however, an extra column `DELIV_PER` in some delivery files. The source of the error was in `src/defs/defs.py`, in the `header_text` variable, which defines the column headers for delivery files whenever a new file is created.
- Code corrections were made and 182 files with this extra column were cleaned up.
- 5 files with just column headers and no data were deleted.
I have written a script, `diagnostics.py`, which will look for common errors. I intend to run this weekly before updating data on the repo.
Users can use `diagnostics.py` to run checks on their own data and report issues.
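
For anyone curious what checks of this kind can look like, here is a minimal sketch. It is not the actual `diagnostics.py`; the function names `check_csv` and `check_folder` are hypothetical, and it assumes each CSV file has a header row and that the first column (for example a date) uniquely identifies a row. It flags the three problems mentioned above: duplicate entries, an unexpected `DELIV_PER` column, and files with headers but no data.

```python
import csv
from pathlib import Path


def check_csv(path, unexpected_columns=("DELIV_PER",)):
    """Return a list of problem descriptions for one CSV file.

    Hypothetical helper, not part of EOD2.
    """
    problems = []
    with Path(path).open(newline="") as f:
        rows = list(csv.reader(f))

    if not rows:
        return ["empty file"]

    header, data = rows[0], rows[1:]

    # File contains only the column headers.
    if not data:
        problems.append("header only, no data")

    # Column that should not be present in this kind of file.
    for col in unexpected_columns:
        if col in header:
            problems.append(f"unexpected column {col}")

    # Assumption: the first column identifies a row, so a repeated
    # value there means a duplicate entry.
    seen = set()
    for row in data:
        if not row:
            continue
        key = row[0]
        if key in seen:
            problems.append(f"duplicate entry {key}")
        seen.add(key)

    return problems


def check_folder(folder):
    """Map each problematic CSV file name in a folder to its problems."""
    report = {}
    for path in sorted(Path(folder).glob("*.csv")):
        problems = check_csv(path)
        if problems:
            report[path.name] = problems
    return report
```

Calling `check_folder` on a data folder returns a dictionary of file names and the problems found in each; an empty dictionary means none of these checks failed.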