georgebv/pyextremes

On GEV maximum likelihood estimation and MCMC with pyextremes

Goddysen opened this issue · 5 comments

I used the MATLAB 'gevfit' command to get the location and scale parameters. These are the same as those fitted by 'pyextremes' (both MCMC and MLE), but the shape parameter k obtained from 'gevfit' is negative, whereas the shape parameter k' obtained from 'pyextremes' (both MCMC and MLE) is positive and has the same absolute value as k. Does 'pyextremes' not display negative parameter values? Or do I need to set something else?

Thanks! Hoping for your answer!

@Goddysen pyextremes uses scipy.stats distributions internally; I suggest reading the documentation here. Different software uses different conventions for the shape parameter of the GEVD (and of other distributions). You can always subclass scipy.stats.rv_continuous if you want a custom distribution (e.g. the GEVD as it is implemented by Matlab) and pass your class to the model.
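A minimal sketch of the subclassing idea, assuming (as worked out later in this thread) that the Matlab shape parameter k is the negative of SciPy's c. The class name `MatlabStyleGEV` and the instance name `matlab_gev` are made up for illustration, and this has not been tested as an input to pyextremes itself:

```python
import numpy as np
from scipy import stats


class MatlabStyleGEV(stats.rv_continuous):
    """GEV with shape k = -c, where c is the shape of scipy.stats.genextreme."""

    def _argcheck(self, k):
        # The default _argcheck only accepts positive shapes; allow any finite k.
        return np.isfinite(k)

    def _pdf(self, x, k):
        # Delegate to SciPy's GEV with the sign of the shape flipped.
        return stats.genextreme.pdf(x, -k)

    def _cdf(self, x, k):
        return stats.genextreme.cdf(x, -k)

    def _ppf(self, q, k):
        return stats.genextreme.ppf(q, -k)


# Hypothetical instance; loc and scale are handled by rv_continuous as usual.
matlab_gev = MatlabStyleGEV(name="matlab_gev")

# The same density, written in the two conventions.
value_k = matlab_gev.pdf(0.5, -0.2, loc=1.0, scale=2.0)
value_c = stats.genextreme.pdf(0.5, 0.2, loc=1.0, scale=2.0)
```

The wrapper only flips the sign of the shape parameter, so location and scale keep their usual meaning.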

Another thing you should note is that pyextremes fits distributions to transformed extremes, not to extremes as they are. You can read more here: https://github.com/georgebv/pyextremes/blob/master/src/pyextremes/extremes/transformation.py

Thanks! I looked up the GEV probability density in the SciPy documentation and learned that its shape parameter c has the opposite sign to the shape parameter k of the general GEV form, i.e. c = -k.

General GEV form:
image

GEV form from scipy.stats.genextreme:
image

Note that the SciPy parameter c is not the shape parameter k itself but its negative (c = -k)!

Thanks for the answer!
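The sign relation can be checked numerically. The sketch below writes out the GEV density in the convention where the shape enters as 1 + k z (the helper `gev_pdf_k` is made up for this check and only covers k != 0) and compares it with scipy.stats.genextreme evaluated at c = -k:

```python
import numpy as np
from scipy.stats import genextreme


def gev_pdf_k(x, k, mu=0.0, sigma=1.0):
    """GEV density with shape k (k != 0), valid where 1 + k * z > 0."""
    z = (x - mu) / sigma
    t = 1.0 + k * z
    return np.exp(-t ** (-1.0 / k)) * t ** (-1.0 / k - 1.0) / sigma


k = -0.2                       # a negative shape, as reported for gevfit above
x = np.linspace(-2.0, 3.0, 11)  # points inside the support for this k

pdf_k = gev_pdf_k(x, k)          # general form with shape k
pdf_c = genextreme.pdf(x, -k)    # SciPy form with shape c = -k

same_density = np.allclose(pdf_k, pdf_c)
```

A negative k from one tool and a positive c of the same magnitude from the other therefore describe the same fitted distribution.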

I am using MCMC to fit GPD parameters. I first fix the threshold theta, then run the MCMC simulation and plot the trace and corner plots. Why do the plots show three parameters instead of two? Can I keep the threshold fixed and estimate only the other two parameters?

The code is as follows:

from pyextremes import get_extremes
from pyextremes.models import Emcee
from pyextremes.plotting import plot_corner, plot_trace

# series: pandas.Series with a DatetimeIndex, loaded elsewhere
extremes_2 = get_extremes(
    ts=series,
    method="POT",
    extremes_type="high",
    threshold=16,
    r="24H",
)

model_7 = Emcee(
    extremes=extremes_2,
    distribution="genpareto",
    distribution_kwargs=None,
    n_walkers=100,
    n_samples=500,
    progress=False,
)

fig_12, ax_12 = plot_trace(
    trace=model_7.trace,
    trace_map=model_7.trace_map,
    burn_in=0,
    labels=[r"Shape, $\xi$", r"threshold, $\theta$", r"Scale, $\sigma$"],
)

fig_13, ax_13 = plot_corner(
    trace=model_7.trace,
    trace_map=model_7.trace_map,
    burn_in=50,
    labels=[r"Shape, $\xi$", r"threshold, $\theta$", r"Scale, $\sigma$"],
    levels=5,
)

Thanks! Looking for your answer!

@Goddysen you are using the Emcee model incorrectly: the extremes should be transformed and the location parameter should be frozen. You can read more in the docstrings (I don't have documentation for those yet).

As for your problem, I suggest you use the EVA class as shown in the documentation quick start section:

model = EVA(series)
model.get_extremes("POT", threshold=16)
model.fit_model("Emcee")
model.plot_trace(burn_in=0)
model.plot_corner(burn_in=50)

If you want to use a custom distribution, make sure you understand the distribution_kwargs argument:

distribution_kwargs : dict, optional
    Special keyword arguments, passed to the `.fit` method of the distribution.
    These keyword arguments represent parameters to be held fixed.
    Names of parameters to be fixed must have 'f' prefixes. Valid parameters:
        - shape(s): 'fc', e.g. fc=0
        - location: 'floc', e.g. floc=0
        - scale: 'fscale', e.g. fscale=1
    See documentation of a specific scipy.stats distribution
    for names of available parameters.
    By default, location parameter for 'genpareto' and 'expon' distributions
    is fixed to threshold (POT) or to minimum extremes (BM) value.
    Set to empty dictionary (distribution_kwargs={}) to avoid this behaviour.

This is done automatically in EVA. If you use Emcee on its own (without EVA) as you show then you need to provide this argument manually.
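The effect of freezing the location can be seen with scipy.stats alone. The sketch below uses synthetic exceedances (the data, seed, and true parameter values are made up for illustration) and compares an unconstrained `genpareto.fit` with one where `floc` pins the location to the threshold, which is the kind of keyword the docstring above describes:

```python
import numpy as np
from scipy import stats

threshold = 16.0

# Synthetic peaks over threshold, drawn from a GPD located at the threshold.
rng = np.random.default_rng(0)
exceedances = stats.genpareto.rvs(
    c=0.1, loc=threshold, scale=2.0, size=500, random_state=rng
)

# Unconstrained fit: shape, location and scale are all estimated.
free_fit = stats.genpareto.fit(exceedances)

# Location frozen via the 'f'-prefixed keyword: only shape and scale are free.
fixed_fit = stats.genpareto.fit(exceedances, floc=threshold)

shape, loc, scale = fixed_fit
print(f"shape={shape:.3f}, loc={loc:.1f} (fixed), scale={scale:.3f}")
```

Going by the docstring, the equivalent for a standalone Emcee model would presumably be something like distribution_kwargs={"floc": 16}, which leaves only the shape and scale to be sampled.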

Thanks!