minvws/nl-covid19-data-dashboard

Use geometric mean in moving average

htot opened this issue · 1 comments

htot commented

For exponential growth or decay processes, an ordinary moving average gives a slight distortion. Just as plotting on a logarithmic scale gives better insight into the growth process, averaging should be done after taking the log of the data.

Taking the geometric mean first and the log afterwards is mathematically equivalent, because the log of the geometric mean equals the arithmetic mean of the logs. It is to be preferred in all cases.
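To make the difference concrete, here is a minimal sketch comparing the two averages on a synthetic series that grows 10% per day. The function names and the data are illustrative assumptions, not dashboard code. Note that the geometric mean needs strictly positive values, so days with zero reports would need separate handling.

```python
import math

def arithmetic_moving_average(values, window=7):
    """Trailing arithmetic mean over `window` consecutive points."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

def geometric_moving_average(values, window=7):
    """Trailing geometric mean, computed as exp(mean(log(x))).

    All values must be strictly positive, otherwise log() fails.
    """
    logs = [math.log(v) for v in values]
    return [
        math.exp(sum(logs[i - window + 1 : i + 1]) / window)
        for i in range(window - 1, len(values))
    ]

# Synthetic example: 100 cases on day 0, growing 10% per day (assumed data).
series = [100 * 1.1 ** day for day in range(14)]

arith = arithmetic_moving_average(series)
geo = geometric_moving_average(series)

# For the first window (days 0..6) the geometric mean reproduces the
# value at the window midpoint (day 3); the arithmetic mean overshoots it.
print(series[3], geo[0], arith[0])
```

On a pure exponential the geometric mean of a window equals the value at its midpoint, while the arithmetic mean is pulled upward by the largest values.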

While considering this: a 7-day moving average leads to a 3.5-day delay in the resulting curve. If the dates are shifted back 3 days, the curve correlates much better with government measures (particularly when plotted on a logarithmic Y-axis).
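The date shift can be sketched as follows. This is a hypothetical helper, not the dashboard's implementation. It assumes each daily value is attached to a single integer day index, so that the midpoint of a 7-point window lies (7 - 1) / 2 = 3 days before the window's last day.

```python
def moving_average_with_labels(values, window=7, centered=False):
    """Return (day_index, average) pairs for a moving average.

    centered=False labels each average with the last day of its window.
    centered=True labels it with the middle day, i.e. shifted back by
    (window - 1) // 2 days.
    """
    shift = (window - 1) // 2 if centered else 0
    result = []
    for last in range(window - 1, len(values)):
        avg = sum(values[last - window + 1 : last + 1]) / window
        result.append((last - shift, avg))
    return result

# Assumed data: a series rising by 1 per day, so values[d] == d.
values = list(range(20))

trailing = moving_average_with_labels(values)
centered = moving_average_with_labels(values, centered=True)

# Trailing: the average of days 0..6 is plotted on day 6, so it lags.
# Centered: the same average is plotted on day 3, where it matches the data.
print(trailing[0], centered[0])
```

The averages themselves are identical in both cases. Only the date each one is plotted against changes.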

Hi! Thanks for the suggestion. We do not use logarithmic scales or calculations like the geometric mean for the dashboard because our main user is 'the average resident of the Netherlands'. We feel that this type of data visualisation, although valuable to those who know how to 'read' it, is too technical and reduces comprehension for our main audience.

In our user research we already see some users struggling to read some of the visualisations and we continuously work on portraying an easy to grok yet accurate representation of the data. The main reason why we've added the 7-day moving average for example is to help users make more sense of the volatile ups and downs of the daily numbers. Whilst you can almost always be more precise, we feel the current approach strikes a good balance.

With our approach, shifting the dates is a mostly cosmetic choice; we view neither option as having strong drawbacks or benefits. Some outlets visualise it one way, some another. We've considered both and chose this way of doing it.

Hope this illuminates some of the reasoning behind all this! Again, thanks for taking the time to make a suggestion. That helps us to consider various viewpoints.