Unusual January 2022 data file in weekly repo is causing error
andybega opened this issue · 1 comment
andybega commented
On Dataverse, the data for 2021 were recently (June 23rd) moved from the weekly repo into a single annual file for 2021. As part of that process, the events for January 2022 in the weekly repo were packaged into one larger monthly file. This is causing an error:
Error in list_local_files(raw_file_dir) :
unexpected non-data file(s) found in 'data/raw':
202201-icews-events.tab
In the weekly repo:
And see this new annual file in the annual repo:
(Handling the year transition is related to #61)
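For illustration, here is a minimal sketch of how the two file names quoted in this issue could be told apart by pattern. The helper `classify_raw_file()` is hypothetical and not part of the package, and both regular expressions are guesses inferred only from the two names shown here (`202201-icews-events.tab` and `events.2021.20220623.tab`). The weekly file naming is not shown in this issue, so it is left out.

```r
# Hypothetical helper, not the package's API. Patterns are assumptions
# inferred from the two file names that appear in this issue.
classify_raw_file <- function(x) {
  kind <- rep("unknown", length(x))
  # e.g. "events.2021.20220623.tab": events.<year>.<yyyymmdd>.tab
  kind[grepl("^events\\.[0-9]{4}\\.[0-9]{8}\\.tab$", x)] <- "yearly"
  # e.g. "202201-icews-events.tab": <yyyymm>-icews-events.tab
  kind[grepl("^[0-9]{6}-icews-events\\.tab$", x)] <- "monthly"
  kind
}

files <- c("events.2021.20220623.tab", "202201-icews-events.tab", "README.txt")
kinds <- classify_raw_file(files)
print(kinds)
```

With a classification like this, a monthly file could be routed to its own handling instead of being reported as an unexpected non-data file.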
andybega commented
Progress: the update now gets past the file check and fails later, with:
> update_icews(dryrun = FALSE)
Ingesting records from 'events.2021.20220623.tab'
Error: UNIQUE constraint failed: events.event_id, events.event_date
There are entirely duplicated rows in the yearly file. Adding a check to remove those (with a warning).
> table(duplicated(foo))
 FALSE   TRUE
671235   5789
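The duplicate check described above could look roughly like the sketch below. This is not the code that was added to the package. `drop_duplicate_rows()` is a hypothetical name, and the small `events` data frame is made-up example data that only borrows the `event_id` and `event_date` column names from the error message.

```r
# Hypothetical helper: drop rows that are exact duplicates of an earlier
# row across all columns, and warn about how many were removed.
drop_duplicate_rows <- function(df) {
  dupes <- duplicated(df)
  if (any(dupes)) {
    warning(sprintf("Dropping %d entirely duplicated row(s)", sum(dupes)),
            call. = FALSE)
    df <- df[!dupes, , drop = FALSE]
  }
  df
}

# Made-up example data; the third row repeats the second exactly.
events <- data.frame(
  event_id   = c(1L, 2L, 2L, 3L),
  event_date = c("2021-01-01", "2021-01-02", "2021-01-02", "2021-01-03"),
  stringsAsFactors = FALSE
)

# suppressWarnings() only keeps the example output quiet.
clean <- suppressWarnings(drop_duplicate_rows(events))
```

Only fully identical rows are removed here. Rows that share `event_id` and `event_date` but differ in another column would still violate the UNIQUE constraint and would need separate handling.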