nhs-r-community/FunnelPlotR

winsorising seems to be incorrect...

aghaynes opened this issue · 1 comment

I might be wrong, but I think your implementation of winsorising is incorrect... or at least different to the methods described in the Spiegelhalter papers.

The supplementary file to Spiegelhalter's BMJ paper (https://qualitysafety.bmj.com/content/qhc/suppl/2005/09/28/14.5.347.DC1/145347appendix.pdf) and the other paper you mention in the help file both state "set the lowest 100q% of z-scores to Zq, and the highest 100q% of z-scores to Z1-q", and the supplement (at least) follows with "this retains the same number of z-scores but discounts the influence of outliers". You seem to use trimming instead... https://github.com/chrismainey/FunnelPlotR/blob/c9029f5906594f5310ca21f59a29a596e5857369/R/OD_adjust_func.R#L123-L124
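To make the distinction concrete, here is a minimal sketch in Python (not the package's R code) of the two operations. The helper names `quantile`, `winsorise` and `trim`, the interpolation rule, and the example z-scores are all assumptions made for illustration only.

```python
def quantile(sorted_vals, p):
    """Quantile by linear interpolation between order statistics (assumed rule)."""
    h = (len(sorted_vals) - 1) * p
    lo = int(h)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (h - lo) * (sorted_vals[hi] - sorted_vals[lo])


def winsorise(z, q=0.1):
    """Pull values below Z_q up to Z_q and values above Z_(1-q) down to Z_(1-q)."""
    s = sorted(z)
    z_lo, z_hi = quantile(s, q), quantile(s, 1 - q)
    return [min(max(v, z_lo), z_hi) for v in z]


def trim(z, q=0.1):
    """Drop values outside [Z_q, Z_(1-q)] instead of replacing them."""
    s = sorted(z)
    z_lo, z_hi = quantile(s, q), quantile(s, 1 - q)
    return [v for v in z if z_lo <= v <= z_hi]


# Made-up z-scores with one extreme value in each tail.
z = [-6.0, -1.2, -0.5, -0.1, 0.0, 0.3, 0.4, 0.9, 1.5, 7.0]
winsorised = winsorise(z)  # same length as z, extremes pulled in
trimmed = trim(z)          # shorter than z, extremes removed
```

Winsorising keeps every unit in the calculation, which is the property the quoted supplement points to, whereas trimming reduces the number of z-scores that contribute.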

Sorry for the delayed reply. I thought I'd be notified of issues posted, but it seems I haven't got that set up right.

You're describing the winsorisation correctly, but there are two methods of outlier adjustment available in the plot. The 'CQC' method is the one you are describing: it uses a square-root transformation and the winsorisation you describe, and it is dealt with a bit earlier in the function:

https://github.com/chrismainey/FunnelPlotR/blob/c9029f5906594f5310ca21f59a29a596e5857369/R/OD_adjust_func.R#L48-L53

The second method uses a natural log-transformation and truncates rather than winsorises (based on the methods in https://digital.nhs.uk/SHMI).

https://github.com/chrismainey/FunnelPlotR/blob/c9029f5906594f5310ca21f59a29a596e5857369/R/OD_adjust_func.R#L123-L128
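As a rough illustration of how the two transformations differ, the sketch below computes z-scores for an observed count O against an expected count E in Python (not the package's R code). The formulas are the commonly quoted forms for standardised ratios and are an assumption here, not a transcription of the linked lines; the function names are hypothetical.

```python
import math


def z_sqrt(observed, expected):
    """Assumed square-root form: z = 2 * (sqrt(O) - sqrt(E))."""
    return 2.0 * (math.sqrt(observed) - math.sqrt(expected))


def z_log(observed, expected):
    """Assumed natural-log form: z = sqrt(E) * ln(O / E)."""
    return math.sqrt(expected) * math.log(observed / expected)


def overdispersion_ratio(adjusted_z):
    """Mean of squared z-scores, taken after winsorising or truncating them."""
    return sum(v * v for v in adjusted_z) / len(adjusted_z)


# Illustrative values only: 121 observed events against 100 expected.
print(z_sqrt(121, 100), z_log(121, 100))
```

Under these assumed forms the two z-scores are close when O is near E and diverge as the ratio moves away from 1, so the choice of transformation and of outlier adjustment both feed into the overdispersion estimate.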

You can switch between them with the 'method' argument, but the default is 'SHMI'. I've found that, with highly overdispersed data, the winsorisation and truncation method can lead to limits that are a bit erratic, so that's why it's not the default.

Let me know if you think it needs more description in the support material though (or alter it yourself and send a pull request if you like). Hope that helps.