Cause of Death List Tidying
jmotis opened this issue · 2 comments
"Found dead" and "Found Dead" should be merged as causes of death; similarly "Other Casualties" and "Other" probably should be merged but you should double-check that one.
Is this the kind of thing worth trying to catch before an export leaves DataScribe, or do we want to make it part of the general data tidying that happens post-export?
If we go with post-export cleaning, the easiest approach might be to have the data tidying script lowercase everything so we get only the distinct values. That would mean everything in the filtering panes on the website would likewise be lowercase instead of title case as it is now. The same goes for "Other" vs. "Other Casualties," although in that case the script would have to do some string replacement. Either way works for me.
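For illustration, here is a rough sketch of what that post-export step could look like. It is not the actual tidying script: the names `ALIASES`, `normalize_cause`, and `distinct_causes` are made up for this example, and the "other casualties" to "other" mapping assumes that merge has been double-checked first.

```python
# Hypothetical alias map for values that differ by more than case.
# Keys and values are lowercase; this merge is an assumption to verify.
ALIASES = {"other casualties": "other"}


def normalize_cause(name: str) -> str:
    """Lowercase a cause of death, collapse whitespace, and apply aliases."""
    key = " ".join(name.split()).lower()
    return ALIASES.get(key, key)


def distinct_causes(names):
    """Return the sorted set of normalized causes of death."""
    return sorted({normalize_cause(n) for n in names})


if __name__ == "__main__":
    raw = ["Found dead", "Found Dead", "Other", "Other Casualties"]
    print(distinct_causes(raw))
```

If the filtering panes should stay in title case, the same map could point at title-case labels instead of lowercase ones, so the choice of display case stays separate from the merge logic.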
Looking at the data CSVs, it seems like the "Found d/Dead" issue may have been baked into the BOM Form (one uppercase - Laxton regular; one lowercase - Laxton 1700; one rogue - Wellcome weekly bills, where we didn't do it as a separate form field). The Other/Other Casualties split might also be from the Wellcome bills. If we can just manually update what's on the site now, I can fix future exports.