georgebv/pyextremes

Long timeseries support: thinking beyond pandas datetime range

dhirendrajnu opened this issue · 3 comments

Hi

I have been trying to use a 700-year time series, but pandas apparently supports at most about 584 years for datetimes with nanosecond (ns) resolution. I was trying to compute the return value stability plot (using pyextremes v2.3.0 and a POT approach) and got the following error:

[image: screenshot of the error]

Is there any workaround for this? I understand that the function takes a pandas Series with a datetime index.

My series looks like this:
[image: screenshot of the series]

Any help is appreciated.

Kind regards,
Dhirendra

Wow, this is certainly something I've never seen before with pandas. One thing you could do (which is a hack) is to compress your time series: represent records as offsets relative to the first timestamp and then shrink those offsets by a factor X (e.g. 10). Then when you ask for a return period of '36.5D', you are actually looking at a 1-year return period.

I don't know how to do it without a hack, unfortunately. It seems to be a fundamental limitation of the way pandas represents timestamps.
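
A minimal sketch of the compression hack described above. It assumes the raw data is available as elapsed days since the first record (plain floats rather than pandas timestamps), uses 365-day years for simplicity, and picks an arbitrary origin date. The helper name `compress_to_datetime_index` and the synthetic data are hypothetical, not part of pyextremes.

```python
import numpy as np
import pandas as pd


def compress_to_datetime_index(elapsed_days, factor=10.0, origin="2000-01-01"):
    """Map elapsed days since the first record onto a compressed DatetimeIndex.

    `origin` is an arbitrary reference date (an assumption), chosen so the
    compressed span stays inside the range pandas can represent at ns resolution.
    """
    elapsed_days = np.asarray(elapsed_days, dtype=float)
    # Shrink every offset by `factor`, then convert the result to timedeltas.
    offsets = pd.to_timedelta(elapsed_days / factor, unit="D")
    return offsets + pd.Timestamp(origin)


# Hypothetical data: one record per year for 700 years (365-day years).
elapsed_days = np.arange(700) * 365.0
values = np.random.default_rng(0).gumbel(size=elapsed_days.size)

index = compress_to_datetime_index(elapsed_days, factor=10.0)
series = pd.Series(values, index=index)

# One real year now spans 36.5 days on the compressed axis.
one_year_compressed = index[1] - index[0]
```

With a factor of 10, the 700-year record occupies about 70 years on the compressed axis, which fits comfortably inside the pandas timestamp range. The resulting `series` has a regular datetime index and can be passed on as usual.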

That's really helpful. But what happens with the return periods estimated by get_return_periods? Do I need to multiply the estimated return periods by the compression factor I used? The return periods estimated with this approach are much smaller. I expect them to be roughly 3.6 times larger, but I am not sure if I am correct.

You are correct. If you scale your time series, you should scale your return periods too. E.g., if you "squeeze" your time series 10x, you need to "unsqueeze" your return periods 10x too: 1 year would be 36.5 days on the compressed time axis.
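
The rescaling arithmetic can be sketched as follows. Both helper names are hypothetical, and the sketch assumes 365-day years and that the return periods were estimated on the compressed series in units of one (uncompressed-size) year.

```python
def compressed_period_size_days(factor, days_per_year=365.0):
    """Length in days of one real year on the compressed time axis."""
    return days_per_year / factor


def unsqueeze_return_periods(compressed_periods, factor):
    """Convert return periods measured on the compressed axis back to real years."""
    return [period * factor for period in compressed_periods]


factor = 10.0

# One real year corresponds to 36.5 days after a 10x squeeze.
period_size = compressed_period_size_days(factor)

# Return periods estimated on the compressed series, in compressed "years".
real_periods = unsqueeze_return_periods([0.5, 1.0, 10.0], factor)
```

Either approach should lead to the same answer: ask for return values using the compressed period size directly, or keep the default period size and multiply the estimated return periods by the compression factor afterwards.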