corona-zahlen-landkreis/corona_landkreis_fallzahlen_scraping

A note on data inconsistencies


I have noticed data inconsistencies, for example in the municipalities (Kommunen) of Soest. These do not come from wrong parsing, but from official numbers that were changed after publication.

We just use the reported numbers as they are. It does not matter if they go down between dates or within the same date, as long as that is what the official numbers say. It is up to the data consumer to decide how to handle these cases.
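As a rough illustration of what handling this on the consumer side could look like, here is a minimal sketch. It assumes rows of `(timestamp, reported number)` in file order; that layout and the helper names `find_decreases` and `latest_per_date` are hypothetical and not part of this repository.

```python
from typing import Dict, List, Tuple

# Assumed row layout: (timestamp string, reported number), in file order.
Row = Tuple[str, int]


def find_decreases(rows: List[Row]) -> List[Tuple[Row, Row]]:
    """Return consecutive (previous, current) pairs where the number went down."""
    return [(prev, cur) for prev, cur in zip(rows, rows[1:]) if cur[1] < prev[1]]


def latest_per_date(rows: List[Row]) -> List[Row]:
    """Keep only the last reported value for each calendar date."""
    latest: Dict[str, int] = {}
    for stamp, count in rows:
        # The first 10 characters are the %Y-%m-%d part, with or without a time.
        latest[stamp[:10]] = count
    return sorted(latest.items())


if __name__ == "__main__":
    rows = [
        ("2020-04-01", 10),
        ("2020-04-02 09:00", 15),
        ("2020-04-02 16:00", 13),  # corrected downwards on the same date
        ("2020-04-03", 14),
    ]
    print(find_decreases(rows))
    print(latest_per_date(rows))
```

Whether a consumer flags such corrections, keeps the last value per date, or does something else entirely depends on the use case; the scraper itself leaves the data untouched.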

Also, the sources sometimes include a time of day and sometimes do not.

Do not add wrong, nonexistent times (like 12:00 or 00:00) when the source gives only a date.

Do not add timezone data. Always use local time in the format %Y-%m-%d %H:%M, or %Y-%m-%d if no time is given.
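The two timestamp rules above could be sketched in Python roughly as follows. The helper names `format_timestamp` and `parse_timestamp` are made up for illustration and are not taken from the scraper code; only the two format strings come from this note.

```python
from datetime import date, datetime, time
from typing import Optional, Tuple

DATE_FORMAT = "%Y-%m-%d"
DATETIME_FORMAT = "%Y-%m-%d %H:%M"


def format_timestamp(day: date, clock: Optional[time] = None) -> str:
    """Format a reported date, adding a time only if the source published one."""
    if clock is None:
        # No time in the source: write the date only, never a made-up 00:00.
        return day.strftime(DATE_FORMAT)
    # Naive local time: strftime with these formats emits no timezone offset.
    return datetime.combine(day, clock).strftime(DATETIME_FORMAT)


def parse_timestamp(text: str) -> Tuple[date, Optional[time]]:
    """Read a timestamp back, returning None for the time if it was absent."""
    try:
        parsed = datetime.strptime(text, DATETIME_FORMAT)
        return parsed.date(), parsed.time()
    except ValueError:
        # Falls through to the date-only format; raises ValueError if that fails too.
        return datetime.strptime(text, DATE_FORMAT).date(), None


if __name__ == "__main__":
    print(format_timestamp(date(2020, 4, 3)))
    print(format_timestamp(date(2020, 4, 3), time(14, 30)))
    print(parse_timestamp("2020-04-03"))
```

Returning `None` for a missing time, rather than a default, keeps the difference between "reported at midnight" and "no time reported" visible to whoever reads the data.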

If a Landkreis (LK) changes its website so that it becomes unparsable, the corresponding CSV will not be deleted, but it will no longer be updated automatically.