lbelzile/mev

ext.index

Closed this issue · 3 comments

This function presents problems when I use apply or within a cicle. For example, for 1000x500 matrix (samples are coluns) it calculates the first 10 but no more. I've tried with a cicle in stead and I realize that sometimes it does not calculate but I don't understand why. Has any one tried the same problem? I wanted to compare it with other estimates, calculating the mean value for some number of replications...

Apologies Cristina for not seeing this earlier and thanks for filing the issue. If I understand you correctly, the function returns an error when using a for-loop or apply.

Is this the error you see?

Error in lm.wfit(x, y, w, offset = offset, singular.ok = singular.ok, : NA/NaN/Inf in 'x'

It was caused by an off-by-one error in the number of exceedances, resulting in an Inf weight being passed to the function lm. Now fixed.

If not, please provide a minimal reproducible example.

The following should work with the latest version on Github.

samp <- rmev(n = 1000L, d = 50L, param = 0.2, model = "log")
apply(samp, 2, function(x){ext.index(x = x, q = c(0.5, 0.6, 0.7), method= "wls"))})

The algorithm in Süveges (2007) is a bit unclear, but alternates two steps until convergence. It seems that the procedure doesn't always work in small samples. The problem comes from the fact that, sometimes, the number of spacings used to fit the exponential is 1, in which case the routine breaks down (the estimator is obtained by fitting least squares with an intercept and an additional parameter).

The weighted least square routine should work most of the time, but if your thresholds are too high, the fit will fail (and there are plenty of good reasons then for it to fail). There are also cases where there are no gaps to use to compute an estimator, regardless of the method. I now catch these "errors" and return NAs with warnings.

In summary,

  1. Function now return NAs if (1) the routine is divergent (number of gaps used to fit the model is less than or equal to 1).
  2. If there are only zero gaps, the function returns 0 with a warning
  3. Return NA if there are less than one exceedance over the threshold (e.g., because of extrapolation with the largest values).