ccodwg/Covid19CanadaETL

Move away from PHAC weekly data in favour of active_cumul-based workflow

Closed this issue · 2 comments

Now that the PHAC epidemiology update is published only weekly, it is very out of date (at the moment it goes up to 2022-06-04, and this will only change on 2022-06-17), so we should move away from using this dataset to update cases (NT, PE) and deaths (NT, PE, YT). NU will continue to be updated from this dataset, since it no longer reports its own data, although it has not actually received updated data in months (this may change in the future).

Naturally, since these datasets are only updated weekly by the respective provinces/territories, they will have to use "as of dates" so that update times are not reported with false precision.

None of the territories are reporting COVID-19 data anymore, so this part of the issue has been superseded by ccodwg/CovidTimelineCanada#61 (pointing out that the NT case and death datasets could be partially reconstructed from archived data) and ccodwg/CovidTimelineCanada#90 (pointing out that the YT death dataset could be partially reconstructed from archived data).

MB has effectively stopped reporting deaths (as pointed out by ccodwg/CovidTimelineCanada#88), so using the PHAC dataset for MB deaths is a necessity.

PE is the only relevant candidate, although the PEI dataset itself is ever-changing, updated weekly, and currently in a difficult format. Part of the time series could be reconstructed from historical data, I suppose. They publish their data on Tuesdays, including data up to the most recent Saturday, putting them far ahead of the PHAC update schedule.
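As a rough illustration of the "as of date" idea for PE, here is a minimal Python sketch that maps a publication date to the most recent Saturday it would cover. The function name `most_recent_saturday` is hypothetical and not part of Covid19CanadaETL, and treating a Saturday publication as covering that same day is an assumption.

```python
from datetime import date, timedelta


def most_recent_saturday(published: date) -> date:
    """Return the latest Saturday on or before `published`.

    Hypothetical helper: assumes a weekly report covers data up to the
    most recent Saturday, as described for the PEI dataset.
    """
    # date.weekday(): Monday == 0, ..., Saturday == 5, Sunday == 6
    days_back = (published.weekday() - 5) % 7
    return published - timedelta(days=days_back)


# A Tuesday publication (2022-06-07) maps back to Saturday 2022-06-04.
as_of = most_recent_saturday(date(2022, 6, 7))
print(as_of.isoformat())
```

Storing this derived date alongside the values, rather than the date the report was scraped, is one way to avoid implying the data are more current than they are.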

Largely switched to a report-based workflow, due to the current nature of data availability, as well as the ease of report-based workflows compared to active_cumul.