georgebv/pyextremes

model.get_summary and model.plot_diagnostics taking a long time

andersdot opened this issue · 9 comments

Hi

I'm following your Quick Start tutorial with my own data, and I notice that for a few time series the methods model.get_summary and model.plot_diagnostics hang, running for over 30 minutes before I kill the cell. I've checked that the time series is uniformly spaced in time and that all values are valid and lie between 0 and 40. I'm not sure what's going on. Are there diagnostics I could run? Or do you have any intuition about why this might be happening? Cheers

If it's helpful, this is what model.extremes looks like for a model that hangs:

Date_Time
1999-04-21 22:52:00    21.1
2000-05-07 22:52:00    20.3
2001-11-22 19:52:00    20.2
2002-03-13 22:52:00    21.2
2003-06-23 21:52:00    19.0
2004-05-10 22:52:00    20.6
2005-03-28 22:52:00    20.3
2006-04-05 20:52:00    19.7
2007-04-12 17:52:00    22.7
2008-06-04 22:52:00    22.4
2009-03-22 17:52:00    20.8
2010-12-30 03:52:00    19.8
2011-02-16 20:52:00    20.8
2012-03-06 22:52:00    24.0
2013-04-08 16:52:00    20.6
2014-05-10 23:52:00    21.4
2015-04-25 20:52:00    21.9
2016-04-25 20:52:00    21.0
2017-03-30 21:52:00    24.5
2018-04-12 23:52:00    20.5
2019-11-25 20:52:00    22.0
2020-06-28 20:52:00    20.5
2021-10-11 22:52:00    22.6
2022-01-22 04:52:00    20.0
Name: WG, dtype: float64

And this is model.extremes for a model that fit quickly:

Date_Time
1999-10-07 05:52:00    22.6
2000-06-20 04:52:00    23.4
2001-05-06 01:52:00    22.3
2002-10-17 05:52:00    23.8
2003-09-21 05:52:00    24.8
2004-10-06 02:52:00    23.9
2005-09-30 01:52:00    21.5
2006-08-22 02:52:00    23.0
2007-07-01 06:52:00    22.9
2008-09-18 05:52:00    21.4
2009-11-18 04:52:00    24.8
2010-07-12 04:52:00    19.4
2011-07-28 04:52:00    20.6
2012-11-28 05:52:00    28.0
2013-05-08 06:52:00    21.9
2014-07-07 06:52:00    23.8
2015-03-14 04:52:00    19.7
2016-08-17 03:52:00    24.4
2017-09-22 05:52:00    24.0
2018-05-24 15:52:00    22.9
2019-06-10 03:52:00    23.2
2020-05-24 09:52:00    23.6
2021-07-26 11:52:00    25.3
2022-01-08 12:52:00    16.5
Name: WG, dtype: float64
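
On the question of diagnostics: one generic option, not specific to pyextremes, is Python's standard faulthandler module, which can write the stack of every thread to a file if a call is still running after a timeout. The sketch below is only an illustration; the helper name run_with_stack_dump and the log file name are made up for this example. The log is a real file rather than sys.stderr because faulthandler needs a file descriptor, which a notebook's stderr stream may not provide.

```python
import faulthandler


def run_with_stack_dump(func, timeout_s=60.0, log_path="hang_traceback.log"):
    """Call func(); while it is still running, dump all thread stacks to
    log_path every timeout_s seconds.

    Hypothetical helper for illustration; it is not part of pyextremes.
    """
    with open(log_path, "w") as log:
        # The watchdog writes straight to the file descriptor, so the file
        # must stay open until the dump is cancelled.
        faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log)
        try:
            return func()
        finally:
            faulthandler.cancel_dump_traceback_later()


# Usage: wrap the slow call in a lambda, for example
#   summary = run_with_stack_dump(lambda: model.get_summary(...), timeout_s=120)
# and then open hang_traceback.log to see which function the call is stuck in.
```

If the log stays empty, the call finished before the timeout. Otherwise the innermost frames in the dump show where the time is going.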

@andersdot the extracted extreme values look normal, but it is impossible to say more based on the information you provided. I would need to reproduce your issue, and for that I need to repeat the same steps. Please provide your code (only the portion related to pyextremes is required) and all inputs passed to it.

I found an error in my Jupyter notebook log that said AttributeError: 'gumbel_r_gen' object has no attribute 'wrapper' and noticed you had written to scipy about this issue. I have a minimal working example with a 52MB .csv file and an .ipynb file, but GitHub won't allow me to attach them. Can I email them to you? Cheers

@andersdot what versions of pyextremes and scipy are you using? I fixed that issue in both libraries, and the fixes have been in place since July 2021.

scipy version 1.7.3 and pyextremes version 2.0.0

Please use pyextremes version 2.2.2 or later; this error was fixed in that version.
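
To confirm which versions the notebook kernel is actually importing, a small check with the standard importlib.metadata module can help. This is a minimal sketch: the helper names installed_version, version_tuple and is_at_least are made up for this example, and the comparison only handles plain dotted numeric versions such as 2.2.2.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '2.2.2' into (2, 2, 2); parsing stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(version, minimum):
    """Simplified comparison, assuming both versions are plain dotted numbers."""
    return version_tuple(version) >= version_tuple(minimum)


for name in ("pyextremes", "scipy"):
    print(name, installed_version(name))

current = installed_version("pyextremes")
if current is not None and not is_at_least(current, "2.2.2"):
    print("pyextremes", current, "is older than 2.2.2; an upgrade is needed")
```

Running this inside the same notebook matters, because the kernel may use a different environment from the shell where pip was run.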

pip doesn't seem to have a version > 2.0.0, but I cloned the git repo and installed from source, and that worked like a charm. Thanks!

You should be able to install using both pip and anaconda. You can see PyPI version history here: https://pypi.org/project/pyextremes/#history

You may have other packages in your environment with pinned dependencies that are blocking the update.