georgebv/pyextremes

Confidence interval question

TanguyRiou opened this issue · 3 comments

I did a block maxima with a block period of one year and I fitted a gumbel distribution (this is snow data FYI). I used a strong confidence interval of 0.999.

block_size="365.2425D"
confidence_interval = 0.999
model = EVA(series)
model.get_extremes(method="BM", block_size=block_size, errors='ignore')
model.fit_model(model='MLE', distribution='gumbel_r')
model.plot_diagnostic(alpha=confidence_interval)

image

I was excpecting the confidence interval to be more like the one in the quickstart (picture below) https://georgebv.github.io/pyextremes/quickstart/
image

Do you know why there is a "linear" confidence interval ?

I have scipy 1.10.1 and pyextremes 2.2.7.

The reason is in the choice of distribution - you use Gumbel distribution which is exponential and appears linear when plotted at logarithmic scale. Also, if it's so tight at 99.9% confidence, this means that your choice of model was correct and that you have good data.

Ok thanks for your answer. I think I am misunderstanding the confidence interval. In the book "An introduction to statistical modeling of extreme values" of S.Coles they give examples where confidence interval is computed based on a normal distribution. I saw in #36 that you are using a monte carlo method. If you have any details of this computation it would be amazing. I tried to dig in your source code but I admit that I did not understand everything.

Monte Carlo idea is really simple - you sample from your data (this way you have a "different" dataset), fit model to it, then repeat this process many times. In the end you have a large number of curves - one for each sample. To get 95% confidence interval you then take those curves and for each return period calculate return values and remove top and bottom 2.5% and then draw curves through those points.